Monitoring network activity

ABSTRACT

A system for analysing network traffic, particularly to detect suspect packets and identify attacks or potential attacks. Data packets which meet defined criteria are detected and their details forwarded to a database server where the details are stored so as to be accessible for use in analysis in conjunction with the details of other detected packets. Packet detection uses a tap and a packet factory which creates a packet for analysis consisting of the received packet and a unique identifier. A series of adapters are used to apply functions to different parts of the packets, to detect those meeting the criteria.

PRIORITY CLAIM

The present application claims priority to GB Patent Application No. GB0022485.7 dated 13 Sep. 2000, entitled “MONITORING NETWORK ACTIVITY”.

The present invention concerns a system for monitoring network activity, and in particular is concerned with detecting potentially damaging traffic on a network.

Hacking into computer systems is a major problem facing users of networks. Attacks by hackers may for example be aimed at reading confidential information, at the destruction of data or at preventing a site from operating properly. For example there have been many instances of Distributed Denial of Service (DDoS) attacks in which a large number of computers are used to bombard a site simultaneously, thus preventing normal activities. To deal with hacker problems a number of “Intruder Detection Systems” (IDS) have been proposed, both commercially and under Open Source licences, but these have not been capable of dealing successfully with the amount of traffic generated by a major DDoS attack. A typical IDS system includes a so-called “sniffer” which analyses packets. Known IDS systems are designed as monolithic systems around a single computer which is assumed to act as the security analyst's console or as an alert system. When considering the bandwidth available to high-profile sites, it becomes clear that designing around a single computer on a LAN is of limited use.

Another problem is the detection of slow scans, which are normally well below the warning threshold for IDS systems due to the often random time lags between events. Slow scans are of significant interest as the level of sophistication and dedication required clearly points to much more capable intruders than the usual heavy scans most IDSs detect, which are the work of so-called ‘script kiddies’.

An object of the system described herein is to provide a high-speed Intrusion Detection System (IDS) which will allow users to detect hostile network activity and take action based both on real-time information and correlation with historical data. It is specifically geared to reducing false positives, detecting subtle attacks and managing attacks rather than simply blocking them. The system is targeted at high-bandwidth sites such as large companies, data centres, co-location sites or e-commerce companies. There are a number of inventive aspects of the system disclosed in the present specification.

Viewed from one aspect there is provided a system for analysing network traffic, comprising the steps of using detecting means to detect data packets which meet criteria defined by one or more functions in the detecting means, forwarding details of the detected packets to data processing means, and storing details of the detected packets so as to be accessible for use in analysis by the data processing means in conjunction with the details of other detected packets.

Thus, in accordance with this aspect of the system disclosed in the present application, a database can be used to store detected packets. An analysis can therefore be carried out to determine whether, for example, a pattern emerges which might suggest a slow scan by a hacker. If such a pattern emerges, then the system may produce new functions so that the detecting means can detect appropriate packets, such as those originating from the hacker, and the necessary action taken. Thus, for example the detecting means may be set up to identify packets which are trying to access an unused port. If such a packet is detected, it is forwarded to the data processing means. Here it may be determined that this particular type of packet may have been encountered only rarely before, or not at all, in which case its details are stored but no further action is taken. However, the analysis may establish an emerging pattern suggesting that a slow scan is in operation, and action may then be taken. This aspect of the invention provides significant advantages over known systems where lack of historical data limits the analysis that can be done.

In one preferred embodiment the detecting means comprises a tap which receives packets of data from network traffic, and packet creating means or “packet factory” which for each received packet creates a packet for analysis which consists of the received packet and a unique identifier. The unique identifier may include an identifier for the tap and a time stamp. Preferably, an adapter is applied to a packet, giving a view on the contents of the packet. Packets meeting defined criteria are identified by applying functions to the view supplied by the adapter. In preferred embodiments, a plurality of adapters are provided which apply functions to different parts of a packet. Preferably, the functions applied to the packets are encoded into a function tree.

The method of detecting packets is inventive in its own right and thus viewed from another aspect the invention provides a system for detecting packets in network traffic which meet predetermined criteria, comprising a tap which receives packets of data from network traffic, and packet creating means which for each received packet creates a packet for analysis which consists of the received packet and a unique identifier.

Viewed from another aspect, there is provided a system for detecting data packets having specified characteristics in network traffic, comprising the steps of using detecting means to detect packets which meet criteria defined by one or more functions, wherein the detecting means comprises first means for copying all network data to second means which classifies packets in the data, information concerning the packets being transmitted to third means which applies at least one function to the information to determine whether the packets meet the specified criteria.

In accordance with these aspects of the systems disclosed in the present specification, the speed with which the process of detection can be carried out is increased significantly. A conventional system uses “sniffers” in the data lines which have been programmed in advance to identify certain types of packets in the stream of data. In accordance with this aspect of the invention, however, the first means will be in the data lines and will forward all data to the second means where the first stage of analysis takes place and packets are classified. The second means can be programmed to identify only specific packet types or even just a single packet type, and the packet types to be classified can be altered without interrupting operation of the system. In a practical application, a number of units would be provided, each with its own first, second and third means and each looking only for specific packet types. This increases the speed with which potentially dangerous packet types can be detected.

The combination of the above two aspects of the inventive system described in the present application is particularly advantageous.

Viewed from yet another aspect, there is provided a system for detecting data packets having specified characteristics in network traffic, comprising the steps of using detecting means to detect packets which meet criteria defined by one or more functions in the detecting means, wherein the detecting means comprises first means for copying network data to function applying means which analyses the data by applying at least one function, and wherein the at least one function may be varied without interruption of copying network data for analysis.

It would be possible to have a system in which, for example, packet detection is carried out at a site, detected packets are then transmitted to a remote database for processing, and then results are returned to the original site for action to be taken.

A significant feature of the preferred system is the ability to perform historical analysis and correlation on the traffic. This makes it possible to build up profiles of both attackers and attacks. A historical record is also extremely useful when prosecuting perpetrators.

Viewed from another aspect of the inventive system disclosed, there is provided a system for detecting and reacting to a denial of service attack over a network, in which a data processing system is programmed to detect a denial of service attack and to automatically re-route legitimate network traffic using additional bandwidth.

Viewed from another aspect of the inventive system disclosed, there is provided a system for detecting data packets having specified characteristics in data traffic, in which multiple detecting means are used so as to provide information from multiple nodes, and data from the multiple nodes is correlated.

Viewed from another aspect of the inventive system disclosed, there is provided a method of detecting data packets having specified characteristics in network traffic, comprising the steps of using detecting means to detect packets which meet criteria defined by one or more functions in the detecting means, forwarding details of detected packets in the network traffic to data processing means, analysing the packets in the data processing means with reference to details of previous packets stored in a database, and if appropriate in accordance with the analysis generating new functions for use by the detecting means.

Viewed from another aspect there is provided a method of analysing packets of data from a network comprising the steps of receiving details of packets in data processing means, storing the packet details in a database accessible by the data processing means, and processing newly received details of packets by comparing them with information stored in the database in order to detect potentially harmful network activity.

The systems described need not be restricted to the detection of harmful activities by hackers. Functions could be chosen to check outgoing traffic from a site, thus looking for confidential information being transmitted or other unauthorised activity.

Other aspects include data processing means programmed to carry out the methods outlined above, and software which when running on data processing means will enable the methods to be carried out. Such software may be provided on a physical carrier such as a CD ROM or may be provided from a remote location such as a site connected to the Internet. The software may be only in respect of a particular part of the system, such as a database.

The preferred system in accordance with the above aspects is geared towards detecting subtle attacks and to managing attacks, as opposed to just blocking them. The objective is to allow users to detect hostile network activity and take action based both on real-time information and correlation with historical data. The system is targeted primarily, but not exclusively, at high-bandwidth sites such as large companies, data centres, co-location sites or e-commerce companies. There is a distributed design where the packets are “sniffed” off the network by one or more sniffers at wire speed and analysed in real-time on a separate computer. The system is designed to monitor all the links into the site concurrently, merging information about attacks from all links simultaneously. A Java interface to access the information runs on one or more security analysts' computers. The preferred system is readily scalable and has the ability to monitor multiple links.

Preferred embodiments of the invention will now be described by way of example and with reference to the accompanying drawings, in which:

FIG. 1 is a diagram showing how the system monitors multiple links into a site;

FIG. 2 is an overview of an embodiment of the system;

FIG. 3 shows details of a packet detecting part of the system;

FIG. 4 shows an example of a list of packet adapters for an ethernet link;

FIG. 5 is a schematic of a pair of nodes in a function tree;

FIG. 6 is an example of a function tree;

FIG. 7 is a schematic of the objects running in an embodiment of the system;

FIG. 8 illustrates the use of a logger and an alerter;

FIG. 9 shows the path of a packet from capture to submission to the database;

FIG. 10 illustrates a restart process for alerters;

FIG. 11 is a schematic of feedback loops in the determination of alerts and reactions; and

FIG. 12 illustrates a communication path.

The system described is a high-speed Intrusion Detection System (IDS) built around Compaq Non-Stop (Trade Mark) technology.

The bandwidth for major sites is already measured in Gbit/s, with the largest web-hosting companies being fed by multiple SONET fibre pairs at OC-48 (STM-16) speeds, 2.488 Gbit/s, or above. At these speeds it becomes necessary to consider a distributed design where the packets are “sniffed” off the network by one or more sniffers at wire speed and analysed in real-time on a separate computer. The present system is designed to monitor all the links into the site concurrently, merging the information in real-time. A Java interface to access the information runs on one or more security analysts' computers.

Although the system described is entirely network based, it can also include host-based information, e.g. information from log files, “whois” databases, standard XML (SNML-Standard Network Markup Language) or other IDS products via suitable CORBA interfaces.

Every component of the system is a CORBA object (Common Object Request Broker Architecture, a standard developed by the Object Management Group (OMG)), which allows the management interface to locate and interact directly with the different components. Every object is designed to allow configuration at run-time through said Java interface and supports a configuration stored in the central database.

The system consists of three main components:

1. Packet sniffers, which take a copy of every single packet of traffic on the wire, and perform initial high-speed selection and analysis;

2. A database engine, which stores and analyses both real-time data and historical information providing alerts and reactions;

3. A Java front-end, which provides the interface between the analyst and the system, displays information and allows sophisticated queries on the data, as well as reconfiguration of the system.

The large incoming bandwidth is monitored by as many sniffers as necessary, each taking a part of the traffic, and then forwarding suspect packets to the database engine. There the data is analysed in real-time for known attacks and stored for historical correlation to be done in near real-time. Alerts are sent to the Java interface and also by other mechanisms such as GSM phone, pager, etc. The different components of the system are connected by a secure private network or a secure VPN system over a shared network.

Apart from detecting known attack patterns and illegal packets, the system also supports the concepts of hot lists and white lists. A hot list is a list of attack patterns, packet signatures or source addresses that have been noted in recent suspect packets and demand special scrutiny. An example is a source that has probed unused ports: such a source is entered onto a hot list to raise an alert if there are more probes, which would point to systematic information gathering. Hot lists are usually updated automatically, but can be added to manually. White lists work on the principle that anything that is not specifically permitted is suspect. For example the connection to a web server on port 80 is explicitly permitted, but an attempt to telnet to the machine is automatically seen as suspect and is noted.
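By way of illustration only, the interplay of a white list and a hot list might be sketched as follows in Python. The class name `ListChecker`, the result strings and the use of (host, port) pairs as white-list entries are assumptions made for this sketch and are not taken from the described embodiment.

```python
class ListChecker:
    """Hypothetical sketch of white-list and hot-list checking."""

    def __init__(self, white_list):
        # Explicitly permitted (destination host, destination port) pairs.
        self.white_list = set(white_list)
        # Sources that have already probed something not permitted.
        self.hot_list = set()

    def check(self, src, dst, dst_port):
        if (dst, dst_port) in self.white_list:
            return "ok"
        if src in self.hot_list:
            # A further probe from a hot-listed source suggests
            # systematic information gathering.
            return "alert"
        # First probe: note the packet and remember the source.
        self.hot_list.add(src)
        return "suspect"
```

In this sketch a first telnet attempt from a given source is merely noted, and only a second non-permitted access from the same source raises an alert.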

The use of white and hot lists is especially advantageous in that it makes it possible to detect suspicious packets without relying on a pattern, but purely on the basis of source or destination of a packet. As most packets are logged on the basis of a pattern, this is a huge advantage, as the development of patterns always lags the development of new attacks. In a further development of the system, patterns of normal traffic will be built up and there will be detection of packets that seem abnormal. As an example, if most packets to a specific service are small, say below 512 bytes, and suddenly a 4 k packet comes along (a defragmentation module would pick up the size of large packets), then there is a possibility that this is an attempted buffer overflow. Normal IDSs could not log these packets, as they work on the principle that a suspicious packet generates an alert, so that this type of behaviour would generate far too many alerts. With the present system such packets can be logged and only reported if more suspicious traffic is detected. The detection of such traffic is also crucial for the development of new attack patterns.

For both of these mechanisms to be useful in practice, it is essential that the lists can be updated through remote calls to the sniffers at run-time. In fact, the ability to tune and reconfigure the system at run-time is one of the design goals and motivated the use of CORBA as middleware. It is essential that these lists are kept updated, and there may therefore be a periodic automated systems scan to verify that no network changes have taken place or to update the current lists.

A major problem with conventional IDSs is that they often lead to a large number of false alarms. A common remedy is to raise the level at which alerts are raised, which means that subtle information gathering goes undetected. The present system will not raise alerts when simple probes are detected, but remembers these and raises an alarm if a suspicious combination of events following the probes is seen.

The question of response is complex and extends into legal issues. It would be particularly useful if it were possible to stop attacks at source, but this might require actions which violate the laws of one or more countries traversed by the intruder. There is therefore a need for a localised response which would, in the example of DDoS, bring on additional bandwidth on separate autonomous systems re-routing legitimate traffic or other strictly local means of damage limitation. For sophisticated attackers, who need to be managed rather than just blocked (otherwise they would only be back in a different guise), a possible solution is to re-direct them to a fake honey pot system, giving time to gather more information about them.

Referring now to the drawings, FIG. 1 shows how the system monitors multiple links into a site so that it can correlate intrusions via multiple routes. In this and FIG. 2, the system is referred to as “K2 Defender”. The figure shows multiple users, ISPs, NAP and a User site.

FIG. 2 is an overview of the system. Sniffers read the data from the incoming pipes and report relevant packets to a central “K2 Defender” logging server hosting a database, which in turn reports attacks to a customized reactor. A secure interface server provides an interface to all components of the system for configuration and interrogation, and is associated with a User GUI. A firewall is also shown.

FIG. 3 is a schematic of the main objects in a sniffer. Packets are captured from the network and an initial filtering step is done in a tap. A packet factory creates a CORBA data structure from a memory pool and inserts it into a buffer. Worker threads, communicating with the adapter factory, pick up the packets and process them through a function tree. If the packets need to be logged, they are again buffered, from where a priority queue picks them up and transmits them to a logger using a “boxcar” technique, and incorporating burst suppression.

The sniffer is based on a high-speed capture engine which copies packets off the wire. The sniffer needs to be sophisticated enough to reduce the amount of data logged to a minimum, without deleting valuable information. Functions geared to detecting DDoS attacks will e.g. look at the frequency of certain packets and only report that these packets are part of a denial of service attack, rather than logging all the individual packets.

To ensure that attackers cannot break into the system, the only connections to the public Internet system are at the sniffers and the return wires are cut, so that no packets can ever be sent through the interface connected to the incoming data link. A second interface onto the private network is used to communicate with the database server.

Under normal circumstances sniffers are stateless, as this makes it possible to write filters with lower memory and CPU requirements. To detect some types of attacks it is useful to have some state in the sniffer, and here the distributed nature of the present system is valuable, as it is possible to dedicate a separate sniffer to stateful functions. An example where statefulness is important is in monitoring the number of dropped packets when a session is being usurped at the start of a man-in-the-middle attack, or in detecting the large number of hanging connections in a denial of service attack against a specific machine. Similarly a DDoS based upon a large number of half-open connections could be detected and reacted upon by sending a reset (RST) for every port opened on the attacked machine.

Another area where statefulness is useful is to detect sudden peaks in the number of packets directed at specific hosts or specific ports. It is very easy to keep these statistics and raise an alert when changes in traffic patterns are detected even if the individual packets seem harmless, and are not logged to the database.
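A stateful function of this kind might, purely as a sketch, keep a running baseline of packets per destination port and flag intervals that greatly exceed it. The exponentially weighted average, the parameter names and their default values below are assumptions chosen for illustration; the text does not specify how the statistics are kept.

```python
class PeakDetector:
    """Hypothetical per-port traffic peak detector."""

    def __init__(self, factor=5.0, min_count=100, alpha=0.2):
        self.factor = factor        # how far above baseline counts as a peak
        self.min_count = min_count  # ignore peaks on very quiet ports
        self.alpha = alpha          # smoothing weight for the baseline
        self.baseline = {}          # port -> smoothed packets per interval

    def end_interval(self, counts):
        """counts maps port -> packets seen in the interval just ended.
        Returns the ports whose traffic peaked in that interval."""
        peaks = []
        for port, n in counts.items():
            base = self.baseline.get(port)
            if base is not None and n >= self.min_count and n > self.factor * base:
                peaks.append(port)
            # Update the baseline after the comparison.
            if base is None:
                self.baseline[port] = float(n)
            else:
                self.baseline[port] = (1 - self.alpha) * base + self.alpha * n
        return peaks
```

Only the per-port counters are held in memory, so an alert can be raised without any individual packet being logged.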

To make the system easily extendible, the packet sniffer contains no knowledge of specific types of packets. Rather the packet factory creates structures from the data read off the wire that are CORBA types, so that they can be transmitted easily. Consequently the system can easily be extended for different protocols, by adding a modified tap and a different packet factory.

An initial filtering step is performed at this stage.

At this point a unique identifier consisting of

1. the unique number of the sniffer (2 bytes) and,

2. a time stamp (8 bytes)

are added to the packet. The resolution of timers on most machines is not good enough to make the (sniffer, timestamp) tuple unique, hence the time is modified artificially to give a unique nanoseconds value since 1 Jan. 2000. The seconds are calculated as Julian seconds plus a second fraction with whatever resolution the hardware clock provides. To resolve the uniqueness problem, a sequence number counter is reset at the start of every interval that represents the system's timer resolution, e.g. 1 ms. As new packets are received, the counter is incremented and added to the time as a nanosecond value, thus making the timestamp unique.
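The identifier scheme just described can be sketched as follows. This is a minimal illustration under stated assumptions: the class name `UniqueIdFactory`, the big-endian byte layout and the injectable `clock` parameter are hypothetical, and the sketch assumes fewer packets arrive per timer interval than there are nanoseconds in that interval.

```python
import time

# Unix time at 1 Jan. 2000 00:00:00 UTC, in seconds.
EPOCH_2000 = 946684800


class UniqueIdFactory:
    """Hypothetical sketch: 2-byte sniffer number plus 8-byte unique timestamp."""

    def __init__(self, sniffer_id, resolution_ns=1_000_000, clock=time.time_ns):
        self.sniffer_id = sniffer_id        # must fit in 2 bytes
        self.resolution_ns = resolution_ns  # timer resolution, e.g. 1 ms
        self.clock = clock                  # returns Unix time in nanoseconds
        self._tick = None
        self._seq = 0

    def next_id(self):
        now = self.clock() - EPOCH_2000 * 1_000_000_000
        # Truncate to the timer resolution actually available.
        tick = now - now % self.resolution_ns
        if tick != self._tick:
            # New timer interval: reset the sequence counter.
            self._tick, self._seq = tick, 0
        else:
            self._seq += 1
        # The counter is added to the time as a nanosecond value.
        stamp = tick + self._seq
        return self.sniffer_id.to_bytes(2, "big") + stamp.to_bytes(8, "big")
```

Two packets arriving within the same millisecond thus receive timestamps that differ by one nanosecond, which keeps the ten-byte identifier unique per sniffer.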

On IP networks a capture library, libpcap, which is used in the tap to capture packets, can filter packets. This makes it possible to discard some local traffic (such as ARP), if desired, and also to split work over several sniffers, by e.g. dedicating one sniffer to TCP traffic and another to all other traffic. With some network cards filtering can even be done at the hardware layer with appropriate patches to libpcap.

It is imperative that the sniffer does not start dropping packets at high data volumes. Consequently the path in the initial thread in the sniffer is kept very short. A packet is received off the wire into a static buffer; the packet factory copies it into a structure allocated from a memory pool, and puts a pointer to this structure into a buffer. The buffer grows dynamically to accommodate peaks in traffic.

Timestamps on data sent by the sniffer will be generated from at least two secure NTP sources distributed throughout the private Defender network. These are based on DES authentication technology which, although slightly out-dated, still provides a rather serious challenge for high-speed key recovery. NTP supports public-key cryptography which will be used for initial key exchange and key refresh. Data on the sniffer is checksummed and a secure digest is generated by SHA-1. This is encrypted with a sniffer-specific key. This is added to the sniffer data.

Security Issues

When an attack is detected, the attack data can be sent off to a trusted third-party, which is assumed to act as a notary for this data and should apply relevant security measures, e.g. printing out data on receipt in readable format, further secure timestamping using an internal source, cryptographically strong signing. Given the amount of data which might be sent from each of the sniffers it is not feasible to send all the collected data to a trusted third party. Rather, only relevant Alerts, with the associated packets, will be sent from the database as opposed to each and every packet.

There is no way to ensure that the data being received by the sniffer is valid. If fake data is being generated to cause particular fabricated alerts to incriminate someone it is impossible to know at the sniffer level or later. Furthermore, it is also very difficult to provide secure authentication at the sniffer level which will stand legal examination in court. Secure signing and/or encryption at the sniffer level requires a private key to be stored on the system. There are a number of available “cryptography cards” for both Intel (Trade Mark) and Alpha-based (Trade Mark) systems which might make this possible. One of the key issues which is addressed by using a trusted third party is guaranteeing the sequencing and timestamping on Alerts. Because components are under the control of the system, it is feasible to fake timestamps or add data to the database without it being generated by sniffers. By sending data to a trusted third party as soon as possible (and only if relevant) it is possible to attempt to minimise the opportunities for a legal challenge.

Packet Adapters

The initial filtering of the packets on the sniffers is performed with the assistance of a set of packet adapters. These objects overlay structures onto parts of a packet to provide easy access to a variety of relevant variables in the packet header. FIG. 4 is an example of a list of packet adapters for an ethernet link. A TCP/IP packet is tunnelled in an IP data stream. Every adapter overlays the correct data structure onto the header section, and contains a pointer to the beginning and end of its section, to the end of the packet and to the next adapter if present. For the example of a TCP/IP packet that is tunnelled over an IP ethernet link, there would be four packet adapters:

1. A datalink layer adapter for the ethernet header

2. A network layer adapter for the first IP header

3. A second network layer adapter for the tunnelled IP header

4. The protocol layer adapter for the TCP header

Every packet adapter has a static size and is allocated transparently from a memory pool; it also contains a factory to generate a packet adapter for all types of adapters that can follow it. This approach keeps access to the different parts of a packet simple and flexible. It is easy to add adapters to accommodate for example IPv6 with its simple initial header and variety of extension headers. The list of packet adapters is built in a lazy fashion, i.e. is extended on demand.
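A lazily built adapter chain for the four-adapter example above might be sketched as follows. The class names, the `followers` table standing in for the per-adapter factory, and the method names are assumptions for illustration; memory pooling and bounds checking are omitted. The header offsets used are those of standard Ethernet II, IPv4 and TCP headers.

```python
import struct


class Adapter:
    """Hypothetical view over one header section of a raw packet."""

    followers = {}  # 'next protocol' value -> adapter class that may follow

    def __init__(self, data, start):
        self.data, self.start = data, start
        self.end = start + self.header_len()  # end of this header section
        self._next = None

    def next(self):
        # Lazy construction: the following adapter is created on demand.
        if self._next is None:
            cls = self.followers.get(self.next_key())
            if cls is not None:
                self._next = cls(self.data, self.end)
        return self._next


class TCPAdapter(Adapter):
    def header_len(self):
        return (self.data[self.start + 12] >> 4) * 4  # data offset field

    def next_key(self):
        return None  # nothing follows in this sketch

    def dest_port(self):
        return struct.unpack_from("!H", self.data, self.start + 2)[0]


class IPv4Adapter(Adapter):
    def header_len(self):
        return (self.data[self.start] & 0x0F) * 4  # IHL field

    def next_key(self):
        return self.data[self.start + 9]  # protocol field


class EthernetAdapter(Adapter):
    followers = {0x0800: IPv4Adapter}  # EtherType for IPv4

    def header_len(self):
        return 14

    def next_key(self):
        return struct.unpack_from("!H", self.data, self.start + 12)[0]


# Protocol 4 is IP-in-IP (a tunnelled IP header); protocol 6 is TCP.
IPv4Adapter.followers = {4: IPv4Adapter, 6: TCPAdapter}
```

Calling `next()` three times on an `EthernetAdapter` over a tunnelled TCP/IP packet yields the outer IP, inner IP and TCP views in turn, and no adapter is created for a layer that no function asks for.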

Packets are tagged for logging, or other analysis, by applying a set of rules to each packet, which for efficiency are encoded into a function tree. Each rule maps onto a path through the tree from the root node to one of the leaves. Many rules are very similar and share many of the checks that must be applied to each packet, and so the use of a tree structure allows these checks to be performed much more efficiently than a naive application of each rule in turn. The tree is organised in such a way that common conditions are tested early, in order to reduce the number of tests that need to be applied to a packet, and to keep the number of times that they are applied to a packet to a minimum. This approach is used throughout the system.

The function tree is designed to be highly flexible and yet easily configured, with a wide variety of fully interoperable node types, each of which applies a particular kind of check on the packets that pass through it. The tree is constructed automatically from the XML specifications of the rules that it implements. The addition and removal of specific rules from the tree is transparently translated into the necessary addition, deletion or reconfiguration of specific nodes, all of which can be performed at run-time whilst the tree is being used.

The flexibility of the function tree derives from the flexibility of the decision nodes, each of which performs the following actions on each packet:

1. The appropriate packet adapter is added if required, e.g. if the node is configured to check the TCP destination port, a TCP adapter is added if not already present;

2. If a hash of exclusive functions is available, the hash key is created and a node picked from the hash table. These must be mutually exclusive conditions, e.g. specific destination port numbers;

3. If a list of conditional functions is present, a list of guards is applied and, if a guard returns true, the packet is passed to the associated child node.

Variations on this theme exist, e.g. boolean nodes, which have an if-then-else decision structure to choose between conditional child nodes. Additionally link nodes provide connectivity and host the functions that are applied to every packet.

Whenever a packet is passed to a function or a child node, these return a logging value, indicating whether the packet should be logged to the database or not. The log values become larger with increased severity of the threat, and if the maximum value is reached at any point, all further tests are ignored and the packet is returned immediately for logging.
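The decision-node behaviour and the log-value short circuit described above might be sketched as follows. The names `DecisionNode`, `MAX_LOG` and `leaf`, the value chosen for the maximum log level, and the representation of a packet as a dictionary are all assumptions for this sketch; the packet adapter step is left out, and link-node functions are folded into the same class for brevity.

```python
MAX_LOG = 10  # hypothetical maximum log value


class DecisionNode:
    """Hypothetical function tree node returning a log value for a packet."""

    def __init__(self, key_fn=None):
        self.functions = []   # functions applied to every packet reaching the node
        self.key_fn = key_fn  # builds the hash key, e.g. the destination port
        self.exclusive = {}   # hash key -> child node (mutually exclusive)
        self.guarded = []     # (guard, child node) pairs, non-exclusive

    def process(self, pkt):
        level = 0
        for fn in self.functions:
            level = max(level, fn(pkt))
            if level >= MAX_LOG:
                return level  # maximum reached: skip all further tests
        if self.key_fn is not None:
            child = self.exclusive.get(self.key_fn(pkt))
            if child is not None:
                level = max(level, child.process(pkt))
                if level >= MAX_LOG:
                    return level
        for guard, child in self.guarded:
            if guard(pkt):
                level = max(level, child.process(pkt))
                if level >= MAX_LOG:
                    return level
        return level


def leaf(level):
    """Build a leaf node whose single function always returns `level`."""
    node = DecisionNode()
    node.functions.append(lambda pkt: level)
    return node
```

A root node keyed on the destination port can then hold, for instance, `root.exclusive[23] = leaf(5)` for telnet and a guarded child for a port range; a packet that matches neither comes back with log value 0 and is not logged.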

FIG. 5 is a schematic of a pair of nodes in the function tree. Link nodes contain a list of functions and a list of decision nodes, that are executed whenever the node is reached. Decision nodes contain a list of child nodes depending on a specific, exclusive condition for a packet field, or depending on a non-exclusive condition on a packet. On the right hand side, a Link node contains functions with the Dest. port being irrelevant, and a link to a node checking the Dest. port. The Dest Port Node contains nodes to functions depending on a specific dest port (e.g. 80) or nodes to functions depending on dest port ranges.

FIG. 6 is an example of a very simple function tree that tests a packet for a particular DDoS attack client command (matching “>”) and a “QAZ” worm infection.

Packets are processed through the function tree by multiple threads, which pass on any packets with a non-zero log level to a queue for dispatch to the logger (indicated by the second buffer of FIG. 3). Packets are read from this queue and, in order to reduce the load on the network and messaging overhead, are sent to the logger in groups (‘boxcars’). These boxcars fill with packets as they arrive from the function tree and are periodically dispatched to the logger. The interval at which boxcars are dispatched is chosen on-the-fly, and is reduced as high log-level packets are entered into the boxcar. This way preset limits on the time a packet can be delayed in the boxcar can be honoured. The thread that sends each boxcar to the logger will not release the memory holding the packets until it receives a return message from the logger indicating that they have successfully been logged.
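The boxcar behaviour described above might be sketched as follows. The class name `Boxcar`, the interval values, the `urgent_level` threshold and the convention that `send` returns true once the logger has acknowledged the batch are assumptions for illustration; time is passed in explicitly, in milliseconds, so that the sketch stays deterministic.

```python
class Boxcar:
    """Hypothetical batch of packets awaiting dispatch to the logger."""

    def __init__(self, send, base_ms=1000, urgent_ms=100, urgent_level=8):
        self.send = send                  # callable(batch) -> True when logged
        self.base_ms = base_ms            # normal dispatch interval
        self.urgent_ms = urgent_ms        # maximum delay for severe packets
        self.urgent_level = urgent_level  # log level that shortens the interval
        self.packets = []
        self.deadline = None

    def add(self, packet, level, now_ms):
        if not self.packets:
            self.deadline = now_ms + self.base_ms
        self.packets.append(packet)
        if level >= self.urgent_level:
            # A high log-level packet pulls the dispatch time forward.
            self.deadline = min(self.deadline, now_ms + self.urgent_ms)

    def tick(self, now_ms):
        """Dispatch if due. Packets are released only after acknowledgement."""
        if not self.packets or now_ms < self.deadline:
            return False
        if self.send(list(self.packets)):
            self.packets.clear()
            self.deadline = None
            return True
        return False  # not acknowledged: keep the packets and retry later
```

Because the batch is cleared only when `send` reports success, a failed dispatch leaves the packets in place for the next attempt.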

As mentioned above, most sniffer functions are stateless to increase the speed at which packets can be processed. Thus every packet is seen as an individual packet without context, and consequently a crude port scan over all ports of a machine will trigger an alert for every non-white-listed port on the system. While it may be desirable to keep some statistics on the port scan, it is futile to log all the individual packets and may even be detrimental to the overall health of the system. In some cases, the easiest solution is to add stateful functions to the sniffer.

In other situations a system could be flooded with a large number of different attack packets, in effect a denial of service attack.

In order to avoid over-loading the system with large numbers of attack packets, a burst management system is introduced. The boxcars in the sniffer are configured to be fairly large, so that they will only be dispatched on a timer pop under normal circumstances. If they fill up before the timer pops it signifies that a large number of threatening packets are coming in. Instead of sending the whole boxcar off, all the packets are again processed by a function tree. Serious threats are left in the boxcar and sent to the logger, but for other patterns, representative packets are chosen and a separate alert is generated. For example, for a portscan, the range of the portscan and the number of packets is kept, and only a single representative packet is stored in the database.
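The portscan case of burst management might be sketched as below. The sketch makes simplifying assumptions that are not taken from the description: packets are dictionaries with hypothetical keys, any log level of 3 or more counts as a serious threat, and every group of lower-level packets between the same source and destination is treated as one scan pattern rather than being classified by a second function tree.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

SERIOUS = 3  # hypothetical log level at or above which packets are always kept


def suppress_burst(boxcar: List[dict]) -> Tuple[List[dict], List[dict]]:
    """Return (packets to log, summary alerts) for a boxcar that filled early."""
    keep: List[dict] = []
    scans: Dict[Tuple[str, str], List[dict]] = defaultdict(list)
    for pkt in boxcar:
        if pkt["log_level"] >= SERIOUS:
            keep.append(pkt)  # serious threats stay in the boxcar
        else:
            scans[(pkt["src"], pkt["dst"])].append(pkt)

    alerts: List[dict] = []
    for (src, dst), pkts in scans.items():
        ports = [p["dst_port"] for p in pkts]
        keep.append(pkts[0])  # a single representative packet is stored
        alerts.append({
            "type": "portscan",
            "src": src,
            "dst": dst,
            "port_range": (min(ports), max(ports)),
            "packets": len(pkts),
        })
    return keep, alerts
```

With this arrangement a scan of thousands of ports costs the logger one packet and one alert record, while the range and packet count preserve the statistics of interest.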

The sniffer contains a management interface that is a CORBA object. Consequently a sniffer can be located through the CORBA naming service, and its individual components can be configured, suspended or activated, and tuned to specific traffic types, through the Java interface at runtime. In particular, functions can be loaded and configured at runtime, which is particularly beneficial should an analyst wish to study traffic which looks suspicious but to which no function is exactly tailored.

As it is possible to interrogate and manage individual functions through CORBA from the Java interface, it is also possible to extend the capabilities of a sniffer by having functions that collect statistics on packets, for example, rather than vetting them for attacks.

The configuration for sniffers, including the default set-up for the function tree, is stored in the database and served by a configuration server. The parameters for individual functions are stored in XML and all standard XML tools can be used to view/edit and determine the validity of these strings. For the manageability of such a complex system, which needs to run a 24×7 service, it is of utmost importance that the configuration is stored centrally in a database, so that sniffers can be reconfigured without taking the system down.

The design of the sniffer is influenced by the performance requirements. Space for captured packets is allocated from a memory pool, which is between 4 and 45 times faster than the system allocators on various known machines. The initial path of the data to the first buffer is very short, and thus fast. Great care has been exercised to avoid all memory locks, as these caused unacceptable overhead. Due to the dynamically expanding buffers, the sniffer can cope with very high loads, without packets being lost.

Key points are therefore that the sniffer:

filters data to detect suspicious packets and reduce logging;

is dynamically configurable;

has an easily extensible design;

stores configurations in a database in XML;

supports multiple priorities for logging packets; and

is scalable through the use of threads.

The database is the core of the system and has three levels of prioritised service:

1. real-time: network attack detection, alert and network response (Electronic Counter-Measures), urgent pattern modification in response to attack;

2. near real-time: historical correlation, data mining in response to analyst queries, pattern modification; and

3. batch: pattern update, slow scan detection, data cleaning, periodic report preparation.

As the database system is the central point in the system, this is the main area in which bottlenecks are to be expected. When coping with input from multiple sniffers on multiple high-speed links, enormous amounts of data may have to be processed and stored. At present, the preferred system capable of coping with such demands, and scaling with Internet-style growth in traffic, is the Compaq Himalaya (Trade Mark) platform.

CORBA provides scalability in the number of servers that can be deployed around the preferred platform. For example it is possible to connect multiple sniffers to the Himalaya and farm out reactions over multiple reactors.

There could however be scalability problems for the processes on the database server, and this is where the characteristics of the preferred platform are valuable. All objects in the database server are replicated inside transaction monitors, so that there are a configurable number of processes available to perform a specific function. The operating system does automatic load-balancing, so that requests go to the CPU with the lowest load. To get the full advantage of this architecture, objects need to be context free as far as possible. If context needs to be kept, it can be kept in the database.

The way in which the preferred platform caches tables means that access to a database table can be as fast as access to shared memory. The database itself can run over multiple processors and multiple disks in parallel and is uniquely scalable.

Another advantage of the distributed design is that it is very easy to add new objects with new functionality to the system, often without taking the rest of the system down. For example, new data series servers or surveyors can be added without taking any of the other processes down.

FIG. 7 is a schematic of the objects running on the preferred system. The figure shows the logger receiving packets from sniffers, linked via an inserter to the database. The logger is also linked to the alerter, together with a stats logger. The alerter is linked to a reactor, which produces customised reactions, and to a query server. The reactor is linked to a messenger which can send messages to the user interface, e-mail, pagers etc. The query server is linked to the database and to a surveyor which in turn is linked to the reactor, an alerter linked to the reactor, and a data series server. The data series server is linked to the database and to the alerter. Also linked to the database is a configuration manager.

A dilemma when storing network packets is that a large number of packet structures exist, so that it is impractical to define a separate table for every type of packet. This suggests that it may be better to store all packets in a single table and to avoid splitting out variables from the packet headers into separate columns. However this is not a practical solution, as it makes querying the table very hard, and largely nullifies the advantage of using a database. Consequently, a compromise solution is required.

For the average IP network the majority of packets will be TCP packets, with a fair number of UDP and ICMP packets sprinkled in between. Thus the default configuration will contain four tables:

1. TCP packets

2. UDP packets

3. ICMP packets

4. All other IP packets

In every table as much useful information as possible is broken out into separate columns. For example, for all packets IP header information, such as the source and destination addresses, is available in separate columns. For TCP packets further information from the TCP header is available, such as ports, TCP flags and indicators for the presence of specific TCP options. Breaking out a lot of information means that even complex queries do not have to look at the raw packets.

The primary key for these tables must be constructed entirely from the packet information received from the sniffer, so that objects receiving a packet via a fast-path can pinpoint it in the database. The unique packet information that the packet factory attaches to every packet provides enough information for a primary key.

When querying for packets that are not in a specific table, e.g. a packet that is tunnelled over IP, the IP table needs to be scanned and some processing on the raw packet may be required. In general this is not a big issue, as the IP table should be small. Some sites may, however, carry a large amount of tunnelled traffic, in which case tables for IPIP or IPsec may be required.

In general non-IPv4 networks need to be supported as well and tables for these have to be inserted. This means that the design must make it easy to add new tables and to allocate data to them. By using the general function tree concept throughout, e.g. when deciding which function to use to write a specific packet to the database, this flexibility is built into the system.
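The allocation of packets to tables can be pictured with a very small sketch. The table names are hypothetical and a plain lookup on the IP protocol number stands in for the function tree that the description uses for this decision; the protocol numbers themselves (1 for ICMP, 4 for IP-in-IP, 6 for TCP, 17 for UDP) are the standard assigned values.

```python
from typing import Dict

# Default configuration: three protocol-specific tables plus a catch-all.
# Table names are illustrative assumptions, not taken from the description.
DEFAULT_ROUTES: Dict[int, str] = {
    6: "tcp_packets",
    17: "udp_packets",
    1: "icmp_packets",
}
CATCH_ALL = "ip_other_packets"


def table_for(protocol: int, routes: Dict[int, str] = DEFAULT_ROUTES) -> str:
    """Pick the destination table for a packet from its IP protocol number."""
    return routes.get(protocol, CATCH_ALL)


# A site carrying a lot of tunnelled traffic could add a dedicated table
# without touching the rest of the routing.
tunnel_routes = {**DEFAULT_ROUTES, 4: "ipip_packets"}
```

Adding a table is then a configuration change rather than a code change, which is the property the function tree is intended to provide.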

The bulk of the front-line work is to receive raw packets from the sniffers and file them in the database. Along with the raw packets the database will receive an indication of how serious the threat from a specific packet is considered to be. High priority packets are reported via method invocations to the Alerter process, as shown in FIG. 8. From a POA thread the Logger first reports high priority messages to the Alerter, where they are received by a receiver, buffered, and sent to a processor. The Logger then sorts and partitions all packets so that they are sent in boxcars to server classes responsible for updating specific tables. In this Example there are boxcars for TCP, UDP and ICMP, an Inserter for TCP and a combined Inserter for UDP/ICMP. In order to avoid delays, the Alerter simply stores the packets in the internal buffer and returns immediately.

Subsequently the Logger needs to write the packets to the database. Due to the required flexibility in the number of different tables, this may seem quite a daunting task. Furthermore, SQL queries on the preferred platform are currently process blocking, which is unacceptable in a process such as the Logger. In practice the latter problem provides the solution to the first.

In order to be able to write packets to the database concurrently, it is common practice on the preferred system to have a separate class of CORBA servers, the Inserters, that execute the SQL queries. In other words, every table will have its own Inserter implementation, but they will all implement a common CORBA interface.

Although this seems to have complicated the insertion process, it is easy to accommodate this flexibility in the Logger. Using the same function tree that is implemented for the sniffer, it is easy to define the appropriate XML to split the packets by protocol and submit them to the correct inserter.

Again, packets are written to a boxcar, and once the boxcar is full or a timer pops, the whole boxcar is sent to the correct Inserter.

FIG. 9 illustrates the path of a packet from capture to submission to the database. The boxcar thread in the Sniffer only releases the boxcar of packets when they are safely written to the database. If any call fails in any part of the chain, the packets are resent. There are shown a tap thread, a farm thread, a boxcar thread, a logger thread, an inserter thread and an alerter thread. In the tap thread, there is capture, filtering and the creation of a packet structure which is written to a farm buffer. In the farm thread, a function tree is applied to the packet and a log level determined. The packet is written to the boxcar and the timeout adjusted for the log level. In the boxcar thread, the system waits for the boxcar to fill up or the timer to pop. There is a branch through burst suppression if required. A remote call is made to the logger. In the logger thread, if the packet is high priority it is sent to the reactor. It is copied into a buffer in the Alerter thread, and there is an early return. In the logger thread, the Inserter to which the packet is to be sent is determined. The packet is sent to a boxcar, and when the boxcar is full or the timer pops, it is sent to the inserter thread. In that thread, a transaction is started, the packet is written and the transaction is committed. The counting semaphore is decremented by the number of packets per thread, and when the counting semaphore reaches zero the boxcar contents are safely on the database and the boxcar can be freed.

A drawback of using boxcars is that the request thread in the Logger needs to wait until all packets have been successfully written to the database, in order to have certainty that no data can be lost. Unfortunately the dispatch of the boxcars is under the control of the boxcar threads. The solution is to tag every packet with an identifier for the thread that submitted it, and let the main thread wait on a counting semaphore. Once the packets have been dispatched and the method call returns, indicating successful insertion of the packets into the database, the boxcar thread decrements the counting semaphore for every packet associated with a specific tag. Once this semaphore reaches zero, the main thread can return safely with the knowledge that all data allocated to it is on the database.
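The tagging scheme can be sketched with standard threading primitives. The names `PendingCounter` and `run_request` are hypothetical, the inserter call is elided, and a condition variable is used here to model a semaphore that counts down to zero; none of this is claimed to be how the described Logger is written.

```python
import threading
from collections import Counter


class PendingCounter:
    """Counts down as packets are confirmed written; waiters wake at zero."""

    def __init__(self, count: int) -> None:
        self._count = count
        self._cond = threading.Condition()

    def done(self, n: int = 1) -> None:
        with self._cond:
            self._count -= n
            if self._count <= 0:
                self._cond.notify_all()

    def wait(self, timeout: float = None) -> bool:
        with self._cond:
            return self._cond.wait_for(lambda: self._count <= 0, timeout)


def run_request(n_packets: int = 5) -> bool:
    """Model one request thread whose packets travel in two different boxcars."""
    tag = "request-1"                       # identifier of the submitting thread
    packets = [(tag, i) for i in range(n_packets)]
    pending = {tag: PendingCounter(len(packets))}

    def boxcar_thread(batch):
        # ... the remote call to the Inserter would happen here; on its
        # successful return the counter for every tag in the batch is reduced.
        for t, n in Counter(t for t, _ in batch).items():
            pending[t].done(n)

    workers = [
        threading.Thread(target=boxcar_thread, args=(packets[:2],)),
        threading.Thread(target=boxcar_thread, args=(packets[2:],)),
    ]
    for w in workers:
        w.start()
    ok = pending[tag].wait(timeout=5)       # request thread blocks until all are written
    for w in workers:
        w.join()
    return ok
```

The request thread never needs to know which boxcar carried which packet; it only waits for its own count to reach zero.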

The cost of this is low, but under high load, the system can handle much higher throughput.

The Alerter is very much a fast-path process to react to emergencies. The Alerter again processes packets in a function tree and attempts to come to a decision whether a specific packet mandates issuing an alert. This function does not stop processing when a signature is matched, so that correct alerts are raised, should a packet match more than one signature. In order to make the decision to raise an alert, queries against the database, issued through the Query Server, can be submitted. How alerts are handled is discussed in the next two sections.

The Alerter needs to do an early return to the Logger, and it is thus not guaranteed that the Alerter will process the packets. If it dies while processing the packets, the associated reactions will be lost. This will be discussed again when describing the functions of the Surveyor. The early return is mandated by the fact that the boxcar with packets needs to be held on the Sniffer until the call from the Logger returns. As determining a reaction can take an indeterminate amount of time (after all, database queries may be required), it is unacceptable from a resource allocation point of view to hold the boxcar on the sniffer until all reactions have been determined.

The delivery of a fast path action is not guaranteed in the case of the Alerter dying, as the Alerter does a quick return to the Logger, and the suspect packet is only stored in an internal buffer in the Alerter after that point. The Alerter is running in a server class, so will be re-started whenever it dies. As its first action it notifies a special Surveyor-style process that scans all recent packets and checks whether urgent actions are required. This surveyor will look at all high-priority packets over a short time period, and issue alerts for them as described above. As all alerts are written to the database through a server class of Alert Query servers, de-duping is taken care of there.

This scan could be done from within the Alerter process, but it would make the start-up time for the Alerter very long. Furthermore all Alerters will be requesting this scan if a server class starts up, which is unnecessary and undesirable. Consequently, it is better to have a single process that handles the re-scanning, as shown in FIG. 10. Inside the singleton restarter there is a thread per table (in this case TCP, UDP and ICMP) that manages the request on a separate bank of processes, that are responsible for scanning individual tables and processing the packets through a tree identical to that found in the Alerter.

FIG. 10 illustrates the restart process for the Alerters. All Alerters contact the Restarter process, which contains a thread for every table. Each thread keeps a flag that is set while its table is being scanned. If the flag is set, the request is ignored, as a process will be busy scanning the table, so that the restart of a server class will only cause a single scan of the tables.
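The per-table flag can be sketched as follows. The class and method names are hypothetical, and the scan itself is passed in as a callable, since the description delegates it to a separate bank of processes.

```python
import threading
from typing import Callable, Iterable, List


class Restarter:
    """Singleton-style coordinator: at most one scan per table at any time."""

    def __init__(self, tables: Iterable[str], scan: Callable[[str], None]) -> None:
        self._scan = scan
        self._busy = {table: False for table in tables}
        self._lock = threading.Lock()

    def request_rescan(self) -> List[str]:
        """Called by each restarting Alerter; returns the tables whose scan it started."""
        started = []
        for table in self._busy:
            with self._lock:
                if self._busy[table]:   # flag set: a scan is already running
                    continue
                self._busy[table] = True
            threading.Thread(target=self._run, args=(table,)).start()
            started.append(table)
        return started

    def idle(self) -> bool:
        with self._lock:
            return not any(self._busy.values())

    def _run(self, table: str) -> None:
        try:
            self._scan(table)
        finally:
            with self._lock:
                self._busy[table] = False   # clear the flag when the scan ends
```

If a whole server class of Alerters restarts at once, only the first call to `request_rescan` starts any work; the others return an empty list.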

An advantage of having an IDS integrated with a database is that it is possible to keep track of changes to certain classes of machines and adjust the log level of an alert accordingly.

Assume that TCP wrappers are configured so that port 25 (for example) appears to be in the “listen” state even though the machine is, say, a web server. This allows information gathering on the attacker without compromising security. When the attack is reported, the rules check against the list of mail hosts, see that the machine is not a mail host and store the information for later perusal. If however such an attack comes in, and the machine is a mail host, it may have to be reported to the analyst immediately. Because in this example the target is a web server without a real SMTP service listening, it is possible to use an SMTP Alert triggered by a fake SMTP service to re-direct the intruder to a honey-pot to further analyse the attack.

Machine classes are defined as sets of IP address and port combinations. For example one could have “webhosts” and “intelhosts”. The problem with these classes is maintenance. It is envisaged that a network scan will be run to populate such classes on a regular basis. Ideally they would be maintained by the analyst but this often leads to the classes and the real machines being out of synchronisation. A regular scan can be used to populate the classes on a per-service basis and OS fingerprinting techniques can be used to populate the architecture/OS classes.

An example of the need for an architecture-dependent class would be to raise suitably the log level of shellcode (buffer overflow) alerts directed to machines with the correct architecture. It is immediately obvious that x86 shellcode directed to a SPARC system is a rather futile attempt and that it is probably part of a “random fire” rootkit. On the other hand, the same shellcode directed to a served port on an Intel system would require the log level to be raised. An example of this could be an FTP daemon buffer overflow.

A further distinction is an OS-dependent class. This would be needed in cases where a shellcode exploit was known to work against particular architecture/OS combinations. An example is the FTP daemon buffer overflow which is present exclusively in x86 OpenBSD up to an un-patched version 2.8 system. The same BSD-derived FTP daemon runs under Debian Linux on x86 but is not relevant due to alignment differences. A more trivial example is offered by a Solaris system running an Apache web server. There is little point flagging as urgent a Windows NT IIS (trade mark) attack against a Solaris Apache server.

These machine classes are used in various rules throughout the system as so-called Macros.
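How a rule might consult a machine class can be sketched as below. The class contents, the addresses (taken from the documentation address range), the mapping from alert class to machine class and the amount by which the log level is raised or lowered are all assumptions chosen for illustration.

```python
from typing import Dict, Set, Tuple

# Machine classes as sets of (IP address, port) combinations; contents are made up.
MACHINE_CLASSES: Dict[str, Set[Tuple[str, int]]] = {
    "webhosts": {("192.0.2.10", 80), ("192.0.2.11", 80)},
    "mailhosts": {("192.0.2.25", 25)},
}

# Hypothetical macro use: which machine class makes an alert class relevant.
ALERT_TARGET_CLASS = {"web attack": "webhosts", "mail attack": "mailhosts"}


def adjust_log_level(alert_class: str, dst_ip: str, dst_port: int, base_level: int) -> int:
    """Raise the level if the target really offers the attacked service, else lower it."""
    target_class = ALERT_TARGET_CLASS.get(alert_class)
    if target_class is None:
        return base_level                 # no rule for this alert class
    if (dst_ip, dst_port) in MACHINE_CLASSES.get(target_class, set()):
        return base_level + 1             # e.g. report to the analyst immediately
    return max(1, base_level - 1)         # keep for later perusal, never drop to zero
```

In the port 25 example above, an SMTP attack against the web server at 192.0.2.10 would be stored at a reduced level, while the same attack against the mail host would be raised.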

As regards the queries that are required to distil information from the data kept in the database, many of these may be expensive queries, but many of them can be standardised. To cope with these in the preferred embodiments, data series servers that keep statistics on certain quantities are provided. In the system described herein, five types of data series have been defined, but due to the distributed nature of the system, it is easy to add more.

In one embodiment the current data series all collect data over 10-minute time intervals, and keep distinct statistics for every sniffer.

Port attempts: The number of connection attempts that triggered alerts to certain well-known ports.

Source-based: The number of packets logged that originated from a set of well-known hosts, i.e. hosts that are in the hot-list as suspicious.

Dest-based: The number of packets logged that are sent to a set of well-known hosts, i.e. hosts which are in one of the defined machine classes discussed above.

ICMP packets: Jumps in the numbers of ICMP Dest Unreachable or, even more interesting, Source Quench, packets.

Alert numbers: The number of alerts that were raised for certain classes of alerts (e.g. web attacks).

The data series server (DSS) recalculates these statistics intermittently with a slight delay (the system needs time to put the packets into the database). The data series servers can be configured so that they do not all update at the same time, to reduce contention on the database. Every data series server will send the newly calculated items to an Alerter that looks at the data series items, may do some additional queries (see the discussion of queries), and generates new alerts if necessary. In the function tree in the alerter, it is possible to put hooks, so that certain classes of packets are sent to specific Surveyors. These will do additional checks against the database and generate new alerts if necessary.

The data series servers also act as servers to supply the latest values of the various aggregates that are calculated. They do not keep this information in memory, but in a database table; with the type of caching on the preferred platform there is no speed difference, and the code is much simpler.

A second set of calculations that the DSS servers supply are means over time. These are often more useful to see trends than just 10-minute statistics, and it is also possible that the analyst misses something that happened during a 10-minute time interval, but the averages will still show a peak some time later. Standard time series averages that update with every new 10-minute time interval are calculated, with mean look-back times of 30 minutes, 1 hour, 12 hours, 1 day, 1 week, 2 weeks and 1 month.

It could become expensive to store these for every single 10-minute time interval; after all, they need to update whenever there was an item somewhere in the history for that particular combination of sniffer and parameters. As a compromise the latest averages are stored in a database, and whenever a new item comes in, they are updated. Re-calculating the averages over a longer time period is relatively fast, and can be done when a graph is requested by the analyst.
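One way to maintain such averages from only the latest stored row is an exponentially weighted moving average, sketched below. The description does not state which averaging scheme is used, so this choice is an assumption, as are the 30-day month and the function and key names. The smoothing factor is chosen as step / (look-back + step), which makes the mean age of the weighted samples equal to the look-back time.

```python
from typing import Dict

STEP_MIN = 10  # length of one data series interval, in minutes

# Mean look-back times in minutes; "1 month" is taken as 30 days (assumption).
LOOKBACKS_MIN = {
    "30 minutes": 30,
    "1 hour": 60,
    "12 hours": 720,
    "1 day": 1440,
    "1 week": 10080,
    "2 weeks": 20160,
    "1 month": 43200,
}


def update_averages(latest: Dict[str, float], value: float) -> Dict[str, float]:
    """Fold one new 10-minute value into the latest row of averages."""
    updated = {}
    for name, lookback in LOOKBACKS_MIN.items():
        alpha = STEP_MIN / (lookback + STEP_MIN)
        previous = latest.get(name, value)   # first item seeds the average
        updated[name] = previous + alpha * (value - previous)
    return updated
```

Only one row per sniffer and parameter combination needs to be read and written for each new item, which matches the compromise described above.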

Keeping the latest averages in a database also makes it quick to restart the data series server, as it can read the last row of averages and update from there.

The different types of data series servers are now discussed in some more detail.

1. Port-based Data Series

This assumes that the relevant protocol has a concept of ports as in TCP over IP or UDP over IP. The rationale behind port based information is to determine when we are facing concerted attacks on specific services. The number of ports to monitor is not that large, perhaps only about 10-20. These statistics are kept per sniffer, and per protocol type (i.e. separating UDP from TCP).

For ports the preferred system only stores statistical data for white-listed ports. This makes the port/host table manageable.

Standard open ports assumed in a basic white-list are:

SSH (22)

SMTP (25)

HTTP (80, 8080)

HTTPS (443)

WEBCACHE (3128, 8080)

PPTP (1723)

DNS (53)

IDENT (113)

The first of the host-based statistics is to detect accesses from hosts that are known to be suspicious, i.e. are in the system's hot list. If increased activity from one of these hosts is detected, this is important information even if the individual packets look relatively harmless. An example is port scans, which are not usually reported to the analyst. If port scans are detected from known bad hosts, this needs to raise an alert or be used to raise the overall log level of a separate alert.

This is also essential for Reactor router re-configuration or honey-pot re-direction. New attacks from a system in the hot list should be immediately dealt with by the honey-pot if other traffic is already re-directed to maintain the “illusion”.

The second set of the host-based statistics are attempts per set of destination machines. These sets should be something like all web hosts, as discussed in the section on macros. It is reasonable to store machine groups as opposed to individual machines if the number is high, since the question normally is “why is there a jump in Web Alerts?” which is immaterial if the machine is not a web server.

In addition the system should preferably be able to scan the packet table for specific IP addresses efficiently even if not particularly fast. The kind of query which will be asked is as described below:

An ICMP based data series keeps track of interesting ICMP traffic regarding boxes. The source and destination address, as well as UDP/TCP ports from the payload of relevant ICMP packets are stored separately in the ICMP packet table. It is thus possible to observe jumps in traffic involving “Martian” nets, or a large number of strange ICMP re-directions. These statistics are kept by type of ICMP packet.

It is of interest if the site on which the system is running is being used as a spoofed source for DDoS attacks. In this case a large number of ICMP Source Quench and/or ICMP Destination Unreachable packets would be received. These could be analysed to identify the destination of the attack and report this to the upstream ISP.

Alert-based data series: Alert-based information is useful to determine what type of attacks are seen in specific parts of the system. Alerts are categorised into just a few larger classes (web attacks, mail attacks, etc.) and statistics for every alert class are kept per sniffer. This enables the analyst to get a quick overview of the level of attacks in different parts of the system. Restarting the servers for these series is straightforward and fast with the data that is stored in the database.

Alerts are generated by various modules in the system, primarily the Alerter, which processes packets considered very dangerous on a fast path, the data series alerter, which processes statistics from the data series servers, and Surveyor-style processes, which trawl the database.

There is only a limited set of alerts, for example about 20. These classify packets into classes such as web attacks, windows attacks etc. Most alerts thus have two parts to them:

1. The alert id, e.g. web attack

2. The function id, which is essentially a signature key, identifying a packet as containing a specific web attack.

Once an alert has been generated, the packets that led to that alert are linked to it via a set of mapping tables, one for every type of packet table. As new packets generating the same type of alert are discovered, they are simply linked to this alert, rather than generating a new alert every time. As the number of packets linked to an alert increases, the alert level can be increased, or a new more serious alert can be generated and linked to the first alert.

Alerts also have a limited life-span of, for example, 15 minutes. This is to ensure that packets do not get linked to very old alerts, and there is always a current view on alerts.
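The linking of packets to a live alert, with a limited life-span, can be sketched in memory as follows. In the described system the links live in database mapping tables; here a dictionary stands in for them, and the class names, the escalation threshold of 100 packets and the level arithmetic are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

LIFESPAN_S = 15 * 60  # example life-span of an alert, in seconds


@dataclass
class Alert:
    alert_id: str        # e.g. "web attack"
    function_id: str     # signature key of the matching function
    created: float
    packet_keys: List[str] = field(default_factory=list)
    level: int = 1


class AlertTable:
    """In-memory stand-in for the alert and packet mapping tables."""

    def __init__(self, escalate_every: int = 100) -> None:
        self._current: Dict[Tuple[str, str], Alert] = {}
        self._escalate_every = escalate_every  # hypothetical escalation threshold

    def link(self, alert_id: str, function_id: str, packet_key: str, now: float) -> Alert:
        key = (alert_id, function_id)
        alert = self._current.get(key)
        if alert is None or now - alert.created > LIFESPAN_S:
            # No live alert of this type: start a new one instead of
            # attaching the packet to a very old alert.
            alert = Alert(alert_id, function_id, created=now)
            self._current[key] = alert
        alert.packet_keys.append(packet_key)
        # The alert level grows with the number of linked packets.
        alert.level = 1 + len(alert.packet_keys) // self._escalate_every
        return alert
```

A packet arriving 16 minutes after the alert was created therefore opens a fresh alert, which keeps the analyst's view current.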

In the absence of front line alerts, the database engine attempts to match a larger set of sophisticated patterns, and if a match is found raises an alert and takes action. These matches are executed by a set of Surveyor objects. They include searches for slow scans, looking for similarities between suspect packets that point to concerted attacks or information gathering, as well as searching for attack patterns to identify specific root kits.

There are several versions of these surveyors, all matching different sets of patterns to the database. Some surveyors are running continuously at low priority and use up any spare CPU capacity to apply rules to the packets in the database, while others are triggered by the arrival of data to do specific types of database searches. Whenever a suspicious pattern is found, the appropriate alert is raised.

There is an unlimited number of surveyor designs depending on the interests of the analyst. In a preferred embodiment a number of these are pre-designed, namely:

Reactive Surveyors: in the event of a volume-based alert, for example a Portscan Alert, it might be interesting to check if the Portscan Alert was being used to mask an underlying, more sophisticated, attack. On sudden surges of activity a Reactive Surveyor can be started to analyse the Alerts in the time-period considered by the volume-based alert. This Reactive Surveyor would concentrate its attention on the Alerts which were not part of the volume-based Alert.

Slow-scan Surveyors: One of the issues with current IDS systems is that the horizon of events is rather restricted. Data quickly accumulates and interesting parts of it can be drowned in the noise of the script kiddies. This surveyor will trawl the database searching for very low-key events which might be much more dangerous. A determined intruder will not show himself up with noisy portscans or random port attempts but will slowly probe the network for vulnerabilities. A possible design would be to search for small violations which have no follow-up, for example an isolated attempt to connect to a non-whitelist port after an ICMP Echo Request to the same host. Over a period of months the would-be intruder might collect enough information to then attack the systems.

Off-line Reactive Surveyor: This is identical in principle to the Reactive Surveyor mentioned above but searches for underlying stealth attacks during high-Alert activity off-line as opposed to the triggered response. The difference here is that the time period is longer and the search scope much larger. In practice it is a faster version of the Slow-scan Surveyor as it only analyses a limited number of time-periods.

Information leakage Surveyor: Along with special sniffer functions collecting outgoing data of certain forms (e.g. SSH traffic), the information leakage of a site can be analysed off-line. For example if the out-bound SSH traffic from a site increases dramatically just before an information leak is discovered then this Surveyor might help in finding the culprit. This can also be linked to tagging rules where particular tags are checked in documents. An example would be the monitoring of the movement of a Word Document containing the keyword “2001 Budget” within an organisation. Should it appear on sniffers monitoring networks where it should not be, then an Alert would be triggered. This Surveyor would collect this kind of data.

Signature collection Surveyor: Another interesting off-line task would be the collection of new signatures. Although it might sound difficult to automatically extract signatures, the class-based Alert system allows for generic payload checks to be performed. For example in the case of “Whisker” a new variant on the long path trick could be developed. This might trigger a Web Alert but only because it was directed to a non-Web host (and hence falls under the “wrong box, don't care” rule). Then this Surveyor would check such mis-destined packets and see if they match anything known in the rules. If not, it could report this as something possibly new to investigate.

Router configuration/Honeypot Surveyor: A slow surveyor, trawling through the database to suggest router configuration changes and/or honey-pot re-direction which could enhance security. For example if a particular host is often the target of Web Attacks then it might be interesting to re-direct these to a honey-pot at the router. Although this might be suggested at the Reactor level discussed below, this could be a more thorough optimised version (for example using CIDR block aggregations where possible).

Other Surveyors can be added with relative ease: the key design principle is that there should be separate Surveyors for each task as opposed to a huge monolithic one. They are then run with differing priorities and time intervals on the Compaq Himalaya.

The purpose of the Reactor is to coordinate the behaviour of the system to the outside world in the face of a stream of alerts offered by the Surveyor(s) and Alerter(s). The Reactor is responsible for determining whether and which reaction should be implemented, where one of the possible actions is to notify the analyst of an event. The Reactor carries out the following steps:

1. Receive an alert, and decide whether this is a ‘new’ alert.

2. Decide which reactions need to be implemented on the basis of this alert.

3. Look at all the actions that have been taken, and decide whether this action is a duplicate or not. If it is a duplicate, link the two actions and return.

4. Log the action to the database.

5. Implement the action.

As multiple sources generate actions, the de-duping in the Reactor will not always work correctly, so client code must be prepared to deal with the occasional duplicate action request. The possibility of duplicate actions being generated is small, but, due to the distributed nature of the system, cannot be eliminated completely.
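The five Reactor steps can be sketched as a single method. This is a single-process illustration under stated assumptions: the class name, the callables `decide` and `implement`, and the in-memory list standing in for the database log are hypothetical, and the distributed race that allows occasional duplicates is not modelled.

```python
from typing import Callable, Dict, Hashable, List, Set, Tuple


class Reactor:
    def __init__(self,
                 decide: Callable[[dict], List[Hashable]],
                 implement: Callable[[Hashable], None]) -> None:
        self._decide = decide          # maps an alert to the reactions it needs
        self._implement = implement    # carries out one action
        self._seen_alerts: Set[str] = set()
        self._actions: Dict[Hashable, List[str]] = {}   # action -> linked alert ids
        self.log: List[Tuple[str, Hashable]] = []       # stands in for the database

    def handle(self, alert_id: str, alert: dict) -> List[Hashable]:
        """Process one alert and return the actions actually implemented."""
        if alert_id in self._seen_alerts:            # step 1: is this a new alert?
            return []
        self._seen_alerts.add(alert_id)
        performed = []
        for action in self._decide(alert):           # step 2: choose reactions
            if action in self._actions:              # step 3: duplicate, so link only
                self._actions[action].append(alert_id)
                continue
            self._actions[action] = [alert_id]
            self.log.append((alert_id, action))      # step 4: log the action
            self._implement(action)                  # step 5: implement it
            performed.append(action)
        return performed

    def linked_alerts(self, action: Hashable) -> List[str]:
        return list(self._actions.get(action, []))
```

Because the action is logged before it is implemented, a crash between steps 4 and 5 leaves a record that the action was intended, which a recovery process could act on.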

The Reactor will take whatever actions are necessary, and will always pass all actions to the Message Server, which is responsible for submitting them to the user interface, and under certain circumstances for sending alerts to pagers, mobiles etc.

A key consequence of building up a database of patterns for rootkits is that it is possible to determine whether a specific alert is part of a standard rootkit attack. The main advantage of this is that the number of alerts that are reported to the analyst can be significantly reduced, by simply reporting that ‘rootkit X was used from address A with this set of spoofed addresses’ instead of a large number of individual alerts. As most attacks resulting in a large number of alerts are from known rootkits, this results in a significant decrease in the number of incident reports that the analyst needs to handle.

Pattern matching to recognise rootkits is hard, as fuzzy matching of alerts to a pattern is required. Attacks are usually accompanied by a large number of packets with spoofed addresses, and it is necessary to determine which addresses are spoofed. Furthermore parts of an attack may get lost, and rootkits increasingly have sources of randomness built in. Nevertheless it seems possible to identify a large number of standard rootkits and derivatives thereof.

A further consequence of being able to recognise rootkits is that the system acquires a predictive quality: once part of a rootkit pattern has been matched, it is possible to identify the set of rootkits matching this pattern and thus the set of attacks that can be expected. In that case it is possible to determine whether these attacks are considered dangerous or disruptive and whether preventative action can be taken (e.g. throttling some traffic at a router if a ping flood is likely).
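The predictive quality described above might be sketched as follows. The catalogue contents, the step names and the function name are hypothetical, and the sketch uses exact prefix matching only; the fuzzy matching that real rootkit recognition would require is deliberately left out.

```python
# Hypothetical catalogue of known rootkit patterns (ordered alert types).
ROOTKIT_PATTERNS = {
    "kit_a": ["icmp_echo", "synfin_scan", "ftp_overflow", "imap_overflow"],
    "kit_b": ["icmp_echo", "synfin_scan", "pop3_overflow"],
    "kit_c": ["udp_probe", "portmap_overflow"],
}


def candidate_rootkits(observed, patterns=ROOTKIT_PATTERNS):
    """Map each rootkit whose pattern starts with the observed steps
    to the steps that can still be expected from it."""
    observed = list(observed)
    expected = {}
    for name, steps in patterns.items():
        if steps[:len(observed)] == observed:
            expected[name] = steps[len(observed):]
    return expected
```

For instance, after an ICMP echo and a SYN-FIN scan have been seen, the sketch returns both kit_a and kit_b together with the overflow attempts each would be expected to launch next, which is the information needed to decide on preventative action.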

FIG. 11 is a schematic of the feedback loops in the determination of alerts and reactions. Packets that satisfy specific signatures raise alerts. All packets, as well as the series of alerts, are picked up by data series servers, which produce statistics, which can in turn lead to new alerts. The statistics, or timers, trigger Surveyor processes, which can trigger alerts in turn.

In FIG. 11 the various dependencies between the different information flows are shown. High priority packets are sent to the Alerter, and may raise alerts. These are stored to the database, and passed to the Reactor to determine whether notification of the analyst or a specific type of reaction is warranted.

The Data Series Servers trawl both the packet tables and the alert tables and calculate standardised statistics from these. These values are passed to an Alerter for stats data, and may in turn lead to a new alert being raised. Alerts are also passed to Surveyor(s) which can look for patterns triggered by an alert, or by trawling the database at periodic intervals. The surveyors can in turn raise alerts, which are passed to the reactor.

The analyst receives the alerts that the Reactor deems important, and can also monitor statistics independently. Should the analyst decide that intervention is warranted, alerts can be modified (i.e. upgraded or downgraded in importance) and reactions can be initiated.

On the database server all SQL queries are process blocking, so that it is not advisable to run any queries from multi-threaded servers. The Query server is a bank of single-threaded servers that execute canned queries on the database. This enables processes such as the Alerter, Reactor and also the analyst to query the database. It makes it possible to re-use queries and it simplifies the coding of these servers.

The query server also contains canned queries for the common information required by security analysts when investigating specific events. These queries will include searches for specific patterns, similar attacks etc.

A few examples are:

1. Finding all packets that were logged from a specific host by all sniffers, or one specific sniffer.

2. Finding the host(s) that led to specific alerts being triggered.

3. Given a sudden increase in the number of logged packets directed at specific ports, producing a histogram representing the specific packet/port values.

4. Attempting to match an attack pattern to the attacks in a specific time period. This would make use of pre-stored rootkit patterns with or without a certain amount of allowed “fuzziness”.

5. Finding similar Alerts across all sniffers directed to machines in the same class as the one being reported (if it is a single machine).

Due to the way the query server is implemented, it is very easy to add new queries to it. When executing canned queries it is much easier to judge the impact of the query on the total system load, than if using general SQL queries.
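A minimal sketch of the canned-query idea is given below, using an in-memory SQLite database purely as a stand-in for the principal database server. The table layout, the query names and the helper functions are assumptions made for illustration. Callers may only name a query from the fixed catalogue and supply its parameters; they never pass SQL text.

```python
import sqlite3

# Hypothetical catalogue: adding a new canned query is one more entry here.
CANNED_QUERIES = {
    "packets_from_host":
        "SELECT sniffer, dst_port FROM packets WHERE src = :src ORDER BY id",
    "packets_from_host_on_sniffer":
        "SELECT sniffer, dst_port FROM packets "
        "WHERE src = :src AND sniffer = :sniffer ORDER BY id",
    "port_histogram":
        "SELECT dst_port, COUNT(*) FROM packets "
        "GROUP BY dst_port ORDER BY dst_port",
}


def run_canned(conn, name, **params):
    """Execute one named query; arbitrary SQL from the caller is not accepted."""
    if name not in CANNED_QUERIES:
        raise KeyError(f"unknown canned query: {name}")
    return conn.execute(CANNED_QUERIES[name], params).fetchall()


def demo_database():
    """Build a tiny example table of logged packets (illustrative data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE packets "
        "(id INTEGER PRIMARY KEY, sniffer TEXT, src TEXT, dst_port INTEGER)"
    )
    conn.executemany(
        "INSERT INTO packets (sniffer, src, dst_port) VALUES (?, ?, ?)",
        [("s1", "10.0.0.5", 21), ("s2", "10.0.0.5", 220), ("s1", "10.0.0.9", 21)],
    )
    return conn
```

Since every query that can reach the database is known in advance, its cost can be assessed once, which is the load-predictability benefit noted above.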

Managing the preferred system needs to take account of the continuously changing network environment, as well as having to cope with new attacks and additional monitoring requests. As always there are trade-offs between flexibility and speed, and in a real-time system, handling very large volumes of data, the trade-offs will generally be biased towards speed.

In the preferred system, speed is achieved by having hard-coded units performing certain functions, and flexibility by being able to specify how these units are linked together. An example of this approach is the function tree discussed earlier.

A second problem is to be able to express the flexibility in the configuration of the system in an easy to understand and portable way. Here the flexibility of XML is an advantage. There are many tools available to manipulate XML strings, and in one preferred embodiment the system makes use of the Xerces (Trade Mark) libraries for this purpose.

The configuration of servers is stored in XML in the database and the configuration server serves this to the processes on start-up. From the front-end the configuration can also be requested, modified and stored back to the database, in which case it will take effect the next time the server is restarted. An alternative is to modify the configuration of an object directly and instruct its management object to store the updated configuration to the database.
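The request, modify and store-back cycle might be sketched as follows. The element and attribute names in the XML fragment are invented for this illustration and are not the schema of the system, and Python's standard ElementTree module is used here in place of the Xerces libraries.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration document as it might be served at start-up.
CONFIG_XML = """
<server name="sniffer-1">
  <parameter name="flush_interval_ms" value="500"/>
  <function type="ip_protocol" match="tcp">
    <function type="tcp_flags" match="SYN,FIN"/>
  </function>
</server>
"""


def load_config(xml_text):
    """Parse a configuration into parameters and a nested function tree."""
    root = ET.fromstring(xml_text)

    def build(node):
        return {
            "type": node.get("type"),
            "match": node.get("match"),
            "children": [build(child) for child in node.findall("function")],
        }

    return {
        "name": root.get("name"),
        "parameters": {p.get("name"): p.get("value")
                       for p in root.findall("parameter")},
        "functions": [build(f) for f in root.findall("function")],
    }


def set_parameter(xml_text, name, value):
    """Return the XML with one parameter changed, ready to be stored back."""
    root = ET.fromstring(xml_text)
    for param in root.findall("parameter"):
        if param.get("name") == name:
            param.set("value", str(value))
    return ET.tostring(root, encoding="unicode")
```

The hard-coded units stay in the kernel; only the way they are linked together, here the nesting of the function elements, is carried by the XML.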

In the front-end, visual methods for adding rules, as well as translators for rules from other IDSs, can be added that generate the desired XML for various IDS rules. There is, for example, a translator available from Snort rules to the XML representation.

Whereas, in the preferred embodiment, the high performance kernel of the system is developed using C++, the front-end of the system is developed using the Java 2 platform. This choice enables the development of a highly portable and advanced user interface. Since the front-end does not need to access private resources of the client machine, it could be embedded as an applet in a web browser.

The main functionalities offered by the front end include:

Basic reporting facilities: A simple, easy to read, display giving an overall view of the situation with a colour-coded guide to the current threat level, for example matching the SANS GIAC colour coding standard;

Sophisticated overview: Packet statistics, attack statistics, sniffer load, historical correlations etc.;

Alert analysis: An extension of the above with the additional possibility of placing queries directly to the database engine, either in SQL or a simplified subset to allow analysts to probe particular attacks or packet patterns;

Maintenance mode: Extending attack patterns, exporting subsets of the database to comma-separated values or IDS interchange formats, such as the CERT-endorsed SNML in the CVS snapshot of Snort for export to different databases (for example Mobile-DB on the Palm (Trade Mark));

Configuration: Configuration of the system by interacting with the individual components, setting parameters and storing the new configuration to the database; and

Interrogating objects: Determining the state of objects interactively, e.g. when querying functions that collect real-time statistics on network traffic.

As is common with user interfaces, the front-end needs to hide the global complexity of the underlying system for daily tasks, while allowing the management of each component when needed. It provides a uniform interaction interface for each accessible object, independent of the physical location, whilst allowing strict control over the range of available operations.

Nevertheless, the front-end only enables interaction with the capabilities offered by the different services. As an example, the database enquiries will strictly follow the set of queries offered by the Query Server.

The design of the user interface preferably follows the standard of contemporary user interfaces in terms of general ‘look & feel’, internationalisation and reactivity. Preferably, attention is paid to security and roaming access, by employing an n-tier architecture.

Access to the system is based on the notion of roles. A role is related to a set of capabilities accessible to a particular group of users accessing the system from a particular network. In order to prevent attacks on the IDS itself, the front-end gateway can decide to downgrade the access privileges of a certain user accessing from a particular network. Finally, since the personal configuration for each user is saved on the server side, the user will be able to access the different tools remotely while keeping his standard workspace (with non-local access possibly forcing a downgraded role value).
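One possible reading of the role mechanism is sketched below. The user names, the trusted network range and the rule of downgrading by exactly one level are assumptions made for this sketch; the three level names are borrowed from the authorisation levels mentioned elsewhere in this description.

```python
import ipaddress

# Ordered from least to most privileged.
ROLE_ORDER = ["read-only", "read-modify", "read-modify-create"]

# Hypothetical gateway data: local networks and the role granted to each user.
TRUSTED_NETWORKS = [ipaddress.ip_network("192.168.10.0/24")]
USER_ROLES = {"alice": "read-modify-create", "bob": "read-modify"}


def effective_role(user, client_ip):
    """Map a (user, accessing network) combination to a role."""
    role = USER_ROLES.get(user)
    if role is None:
        return None  # unknown users are given no role at all
    address = ipaddress.ip_address(client_ip)
    if any(address in network for network in TRUSTED_NETWORKS):
        return role
    # Non-local access: downgrade one level, never below the lowest role.
    return ROLE_ORDER[max(ROLE_ORDER.index(role) - 1, 0)]
```

The user's stored workspace is unaffected by the downgrade; only the set of operations the gateway will forward is reduced.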

The standard communication path between the system core objects and the front end is described in FIG. 12.

The front-end architecture relies on a standard n-tier architecture, where each component in the middle-ware architecture plays a specific role.

The front end gateway is the only entry point from the external network, via the firewall, to the analyst's GUI client. It is responsible for the identification of users and the management of the different incoming sessions, and in the preferred embodiments the authentication phase will be delegated to the firewall. It is also responsible for filtering the requests based on the privileges granted to a specific user. According to a privilege schema (the role of the user), it exhibits a specific view on the system objects and forbids some interactions.

A specific subset of the system objects may offer an interface to the user as well as other external services. This is particularly the case for the different objects such as the Alert server, the Query server, the Reactor and so on.

Obviously, while these components are loosely coupled with the front-end, they are not elements of the front-end itself. In developments of the system, there could be other components to facilitate the work of the IDS analyst (report generator, traffic monitor, etc.).

With reference to FIG. 12, the user interactions are classified in three main groups:

Configuration: All of the interactions belonging to this group are forwarded to the Configuration Server, which is responsible for centralising the configuration of the system and maintaining its persistence.

Monitoring: In a similar way to configuration, the monitoring requests are centralised on a system monitor. The monitoring resources this component provides are lease-based. Each client wishing to follow the status of a specific part of the system can subscribe for the associated set of events for some period. The system monitor is then responsible for gathering this information and pushing it regularly to the client. If a lease is not renewed the system monitor will then stop pushing the information to the client. When no other clients are interested in the same information stream, the system monitor will itself un-subscribe from the object. This subscription mechanism prevents overload of time critical components while the lease based mechanism avoids pushing the information to ghost clients.
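The lease-based subscription might be sketched as follows. The class and method names are hypothetical, and time is passed in explicitly so that the behaviour is easy to follow; a real monitor would read a clock and push data over the network.

```python
class SystemMonitor:
    """Sketch of a lease-based monitor with hypothetical names."""

    def __init__(self):
        self.leases = {}         # (client, stream) -> time at which the lease expires
        self.subscribed = set()  # streams the monitor itself currently follows

    def subscribe(self, client, stream, now, duration):
        """Grant or renew a lease and make sure the monitor follows the stream."""
        self.leases[(client, stream)] = now + duration
        self.subscribed.add(stream)

    renew = subscribe  # renewing a lease simply extends its expiry time

    def push(self, stream, now):
        """Return the clients that should receive an update at time `now`."""
        # Leases that were not renewed in time are dropped (ghost clients).
        self.leases = {key: expiry for key, expiry in self.leases.items()
                       if expiry > now}
        clients = sorted(c for (c, s) in self.leases if s == stream)
        if not clients:
            # Nobody is interested any more: un-subscribe from the object.
            self.subscribed.discard(stream)
        return clients
```

The monitored object only ever serves one subscriber, the system monitor, however many analyst consoles are following the same stream.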

Direct Access: Finally, for non-periodical interactions some services will propose direct interactions between clients and objects within the system. One such object might be a Data Series Server (DSS) which, complementary to real-time subscription, will offer historical data series for the purpose of Alert Analysis.

The communication path discussed above enables interaction with the different components of the system in a secure way, assuming the following:

Private network: The internal security network should be separate and reserved to the sniffers and the principal server. Should this not be possible then each sniffer will run IPsec locally and traffic will be directed to an IPsec gateway forwarding packets to the principal server. This will be transparent to the application as IPsec routes packets via the secure link by encapsulating IP.

Firewall: The private network can only be accessed via a firewall, which blocks any incoming request which is not addressed to the Front-End gateway. The connection to this gateway is encapsulated via SSL.

Authentication: Authentication takes place both at the access level and at the application level by means of user/password or user/SecureID combinations. In the case of application-level security different authorisation levels should be defined (read-only, read-modify, read-modify-create, etc. depending on the required granularity).

Since the communication between the front-end gateway and the private network relies on CORBA, the only legal interactions which are triggered by the gateway concern the calls to object methods which are part of the exported services. Direct interactions with the internal file system or with the database will not be supported and simply rejected by the system.

The gateway is responsible for storing the different user configurations and mapping the combinations of user and accessing network to each role.

The front-end enables an analyst to monitor the status of the network. The analyst screen is composed of different zones:

Near real time alert zone: The incoming alerts will be reported in near real time through a scrolling list. A colour code will be associated with the severity of the alerts. Selecting an alert will open it in the alerts analyser area.

Alert analyser area: The alert analyser displays the further information regarding an alert (packet, network concerned, reasoning involved) necessary for further analysis.

Knowledge base area: This area will provide further in-depth information to assist the analyst, including access to security web-sites, mailing lists, etc. In order to further help the analyst, this area will provide quick access to the main security sources of information. This knowledge base area will rely on the indexing of the main web sites and mailing lists dedicated to security.

System monitoring area: the interface will regularly receive statistical information regarding the time critical system objects (i.e. sniffers, logger and reactor) and display them in a graphical way. This area will give quick access to the configuration interface which is described below.

The alert analyser will display, on request, further information regarding an alert (packet, network concerned, reasoning involved). Each alert will be presented together with different associated actions. These actions include possible responses to an intrusion but also options for further analysis of the alert.

Information Gathering

Analysing an Alert mainly requires the gathering of more information from the database via the Query Server, to:

Discover recent Alerts of the same or related class;

Discover Alerts coming from the same source;

Discover Alerts sent to the same destination;

Query the main database with complex Alert-related queries (for example isolating sequencing or spoofing artefacts).

In order to evaluate the danger of this threat, it could also trigger a query to the hosts database or a refresh of this database. Alternatively, the analyst may request the gathering of information from the knowledge database in order to check recently reported attack schemes.

A large variety of responses may follow the analysis of an alert. It is possible to classify the different responses in four main groups:

Passive counter-measure: Firewall adjustment, software updates.

Active counter measure: Redirection to a ‘honey pot’ system, counter-attacks where permissible.

System reconfiguration: In order to eliminate false positives or to gather more information. This could imply reconfiguration of the sniffers (monitor all packets coming from this address/network), the reactors (no longer report such alerts/modify the severity), or the surveyor (rescan for a specific attack pattern in this last week).

Alert report: Report the alert by sending notification to other analysts or CERT.

Preferably, alert cross-checking is possible. The analyst will be able to tag an alert together with relevant information (checked with a possible interpretation, to further analyse in detail, for example) and to generate a report. On request, the whole set of available information related to an alert could be stored in a safe place in order to avoid the automatic reduction of data.

Complementary to these services, the alert analyser will offer to “manually” regroup sets of correlated alerts. These sets of alerts could then be stored as a ‘Macro alert’ and be reported at once, but could also be used to refine attacker profiles.

Each component of the preferred system is responsible for implementing different configuration interfaces. These interfaces enable the end-user to browse the main objects in a similar way.

These different generic interfaces are as follows:

Principal server interface: This enables the setting of some specific configuration parameters on the principal server, e.g. to modify the configuration of server classes;

Daemon interface: This allows the analyst to start/stop/restart/suspend a specific object;

Status interface: This returns basic information on the object status (active, sleeping), work statistics, number of instances, etc.;

Configuration interface: This enables the retrieval or modification of the XML configuration.

As previously described, the configuration of each service will be described in an XML file conforming to different XML schemas which define the required syntax, structure and semantics of XML documents. These configurations are centrally stored in the database managed by the Configuration Server.

These different capabilities will initially be accessible to a user by means of a GUI similar to an advanced file browser in order to be user friendly and quickly usable. For large-scale organisations, for example, it is possible to offer a view taking into account the location (physical or logical) of the components, similar to HP OpenView (Trade Mark).

It is important for the performance of the system at wire speed that there are no bottlenecks between the actual data path it is monitoring and the IDS core on the database. In one preferred embodiment, the hardware is as follows:

Sniffer: Top-end PC or Alpha workstation, dual 1.2 Gbit/s ATM cards or GBIC/ATM combination (ATM required to interface to the preferred principal server), running OpenBSD or a security-enhanced version of Tru64 Unix, large memory to avoid touching the disk for swapping, high speed internal SCSI disk (U2W) for spill-over;

Principal database server: Compaq Himalaya (Trade Mark) multiprocessor, ATM card for each sniffer, Fast Ethernet cards to serve analyst consoles, ample disk space;

Front-end: Simple PC, not necessarily dedicated.

Optionally the reaction modules and the web server can be moved off the principal server on to low-cost PCs running Unix.

The powerful packet collection system described above could also find other uses beyond intrusion detection, and for example:

Company-wide internal security: Sniffers placed in different locations within a company's Intranet would make it possible to pick up confidential data being sent by unencrypted e-mail or disallowed network usage;

Cyber-nanny: This would allow semi-automatic detection and blocking of unsuitable sites. The action on detection would simply be to firewall the site or force redirection of traffic;

Traffic analysis: large sites could use the spare capacity, if any, to analyse and improve their Internet traffic. It would act as a very sophisticated form of the standard tool mRTG. Indeed, SNMP adapters for the present system could be developed for this purpose.

The expression “suspect network traffic” used herein is therefore to be construed broadly, with regard to the context in which the system is used.

The run-time reconfiguration of the function trees in the Sniffer and Logger requires some special consideration, due to the multi-threaded nature of these objects. Frequently the STL data structures used within a particular node of a tree cannot safely be updated by one thread whilst other threads might still be traversing the sub-tree beneath that node. Therefore care must be taken to perform updates of the function trees in a thread-safe manner (although, of course, many of these problems disappear on the preferred principal server with its non-preemptive threads). The problem is further complicated by the need for efficiency, which rules out the use of mutex locks.

A preferred solution to this problem is as follows. When a non thread-safe change to the function tree is requested at a particular node of a function tree, the change is made to a duplicate of the relevant data structure. While this is done the original data structure remains active for all threads arriving at the node; the duplicate is not accessible to these threads whilst it is being modified. Since the data structures simply contain pointers to sub-tree nodes, the sub-tree is entirely unaffected by the duplication (in particular, sub-tree nodes are not themselves copied). Then, once the new data structure is complete and ready for use, it is made active by a single atomic pointer update.

From this point on, any threads which newly arrive at this node of the tree will use the new data structure, and for these threads the change to the function tree is effective immediately. Threads which were already traversing the sub-tree of the node at the time the change was made will continue to use the original data structure until they leave the sub-tree altogether, after which they, too, will use the new data structure. The difficulty is to establish at what time it is safe finally to delete the old data structure, that is, when it is known with certainty that no threads are still using it.

This is done by the introduction of a function tree worker monitor which keeps track of the progress of the threads which are processing data through the tree. As each thread enters the function tree, it informs the worker monitor of the sequence number (or time-stamp, as applicable), of the data item (i.e. packet, alert or whatever) it is processing. Likewise, each thread notifies the worker monitor when it leaves the tree. When a change is made to a particular node of the tree, and after the new data structure has been made active, the old data structure is ‘time-stamped’ with the sequence number of the ‘most recent’ data item to have been sent into the tree, information supplied by the worker monitor. It is then known to be safe to delete the old data structure when all the data items currently being processed through the tree have sequence numbers later than that recorded at the time the new data structure was activated. The worker monitor can be requested to wait until this condition is true. However, since no attempt is made to delete the old data structure (which takes up only a minimal amount of memory) until the next change is made to that node of the function tree, the wait time will generally be negligible.
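The duplicate-and-swap update and the worker monitor might be sketched as follows. This is a single-threaded illustration with hypothetical class and method names, not the C++ implementation: a Python reference assignment stands in for the atomic pointer update, and dropping the reference to the old list stands in for deleting it. Where a real system would ask the worker monitor to wait, the sketch raises an error instead.

```python
class WorkerMonitor:
    """Tracks which data items (by sequence number) are inside the function tree."""

    def __init__(self):
        self.in_flight = set()  # sequence numbers currently being processed
        self.latest = -1        # most recent sequence number sent into the tree

    def enter(self, seq):
        self.in_flight.add(seq)
        self.latest = max(self.latest, seq)

    def leave(self, seq):
        self.in_flight.discard(seq)

    def safe_to_delete(self, stamp):
        # Safe once every item still in the tree entered after the swap.
        return all(seq > stamp for seq in self.in_flight)


class Node:
    """A function tree node whose child list is replaced, never modified in place."""

    def __init__(self, children, monitor):
        self.children = children  # the active data structure
        self.monitor = monitor
        self.retired = None       # (old structure, stamp) awaiting deletion

    def reclaim(self):
        """Drop the retired structure if no in-flight item can still be using it."""
        if self.retired is None:
            return True
        if self.monitor.safe_to_delete(self.retired[1]):
            self.retired = None
            return True
        return False

    def update(self, change):
        # Deletion of the previous old structure is deferred until this next change.
        if not self.reclaim():
            raise RuntimeError("old structure still in use; a real system would wait")
        new = list(self.children)  # shallow copy: sub-tree nodes are not copied
        change(new)                # modify the duplicate only
        old, self.children = self.children, new  # single reference swap
        self.retired = (old, self.monitor.latest)
```

A worker that obtained the child list before the swap continues to see the unmodified list, while any worker arriving afterwards sees the new one.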

It is possible, although unlikely, that duplicate alerts can be delivered from the Reactor. This may be difficult to avoid, and consequently client processes should be prepared to deal with duplicate alerts, or singleton processes need to be inserted that do any final de-duping.

The argumentation is as follows:

1. Assume two action sources receive data that leads to the same action to be generated. Both sources check the database, find no previous version of this alert and decide to insert it. Before inserting they retrieve a new action number from a single process that issues action numbers.

2. If one source commits a bit earlier than the other, it will call a Reactor with the action, the Reactor will scan the database, find no duplicate action, initiate the action and mark it as acted upon. When the second source submits the action to the Reactor, it will see that the action has been handled, merge the actions and ignore.

3. If both sources insert and commit the alert at the same time and report it to two different instances of the Reactor, both will see both actions as not handled. In a case like this the action id can be used to de-dupe it. The Reactor with the higher action id simply exits and leaves the other to deal with the alert.

4. It may be possible for the two sources simultaneously to generate identical actions, say with action numbers 315 and 317, but there is a delay in the processing of the action 315. Action 317 is transmitted to a Reactor, which scans the database, discovers no identical action and initiates it. In the mean-time action 315 is committed and sent to an alternate Reactor, which scans the database, sees action 317 but does not see that it is being acted upon as it has not been marked as such yet. Having an action number that is lower than 317, it decides to initiate the action and consequently the action is initiated twice.

Duplicate actions will be very rare, but there is nevertheless a very real chance that they can occur: especially when an attacker launches several attacks from the same source, multiple actions leading to an action request to block the source address may hit the system more or less simultaneously. As the system will be under strain, it is very possible that the order in which action messages are numbered and reported will change.

Some particular examples of implementations of the system will now be described.

The following is a TCP packet resulting from an NMAP scan, which is used to identify open ports on systems. This particular packet is recognised because the Ack flag is set, but the sequence number that is acknowledged is 0. The complete hex dump of the packet is also shown. The first 20 bytes are the IP header, followed by the TCP header and the payload.

2001-08-07 20:23:06 TCP 172.16.1.14:80 -> 195.212.241.243:41363 A
0000 d701  d059  2f17  acd7  01d0  5914  6276  08d7 ...Y/.....Y.bv..
0010 0145  d701  d701  3c85  49d7  01d7  0130  06a2 .E....<.I....0..
0020 8cac  1001  0ec3  d4f1  f3b2  a1a1  93b4  c109 ................
0030 8ad7  01d7  01d7  01d7  01a0  1004  d701  4fbf ..............O.
0040 d701  d701  0303  0a01  0204  0109  080a  3f3f ..............??
0050 3f3f  d701  d701  d701  d701  d701  d701 ??............
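The signature described for this packet, an ACK flag together with an acknowledgement number of 0, might be tested as sketched below. The function names are hypothetical; the field layout is that of the standard TCP header. The sketch operates on a TCP header that has already been separated from the IP header, and the helper builds synthetic headers for illustration rather than parsing the dump above.

```python
import struct

TCP_ACK = 0x10  # ACK bit within the TCP flags field


def make_tcp_header(src_port, dst_port, seq, ack, flags):
    """Build a minimal 20-byte TCP header without options, for illustration."""
    offset_and_flags = (5 << 12) | flags  # data offset of 5 32-bit words
    return struct.pack("!HHIIHHHH", src_port, dst_port, seq, ack,
                       offset_and_flags, 1024, 0, 0)


def is_ack_zero_probe(tcp_header):
    """True if the ACK flag is set while the acknowledged sequence number is 0."""
    if len(tcp_header) < 20:
        return False
    _src, _dst, _seq, ack, offset_and_flags = struct.unpack(
        "!HHIIH", tcp_header[:14])
    flags = offset_and_flags & 0x3F  # low six bits: URG ACK PSH RST SYN FIN
    return bool(flags & TCP_ACK) and ack == 0
```

In the system as described such a test would be one function applied by an adapter to the TCP part of a packet within the function tree.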

Such probes are the basis of information gathering on target systems. Most port scans are hard scans, i.e. scan a whole range of ports and machines in a short space of time. Stealthy scans would only probe single ports with large intervals in between.

The system will log all such scans, but not report them through to the analyst. When scanning the database, recurring source addresses from single probes will be picked up and the source address will be put into the hot list so that future scans, as well as any following intrusion attempts, will be picked up and logged. This is important, as such stealthy scans point to more sophisticated hackers.

Fingerprinting a rootkit is difficult: it tends to consist of some recurring attack patterns and some randomness. A good fingerprint is made up of a number of distinct elements which are easy to categorise and then some fuzziness.

A rootkit could be made up of the following patterns for example:

1. “are you alive?” ping (ICMP Echo Request)

2. nmap SYN-FIN scan

3. if POP-3 is open then POP-3 daemon overflow

4. if FTP is open then FTP daemon overflow

5. if IMAP is open then IMAP daemon overflow

6. if PORTMAP is open then PORTMAP daemon overflow

with the slight complication that what follows the nmap SYN-FIN scan can be in random order. This might seem a hopeless task, given that the above will be surrounded by lots of other packets, but imagine this detection sequence where the interval is assumed to be small:

1. Src_2 ping to Dst

2. Src_0 ping to Dst

3. Src_1 ping to Dst

4. Src_3 ping to Dst

5. Src_1 nmap SYN-FIN to Dst

6. Src_2 nmap SYN-FIN to Dst

7. Src_0 nmap SYN-FIN to Dst

8. Src_3 nmap SYN-FIN to Dst

There are individual entries for Src_0 through to Src_3 mapped to a single Dst. Of these sources three out of four are fake (“spoof”) addresses which have nothing to do with the problem. One must be the true one otherwise no information would ever flow back.

The surveyor will consider these SYN-FIN scans as one, because the time interval is too small to imagine them being separate scans.

Assume that FTP and IMAP are both open ports. Consequently the sniffer will detect the second “series”:

1. Src_1:1024 SYN to Dst:21 (FTP)

2. Src_1:1025 SYN to Dst:220 (IMAPv3)

3. Dst:220 SYN ACK to Src_1:1025

4. Src_0:1024 SYN to Dst:220 (IMAPv3)

5. Dst:21 SYN ACK to Src_1:1024

6. Src_0:1025 SYN to Dst:21 (FTP)

7. Dst:220 SYN ACK to Src_0:1024

8. Src_1:1024 RST to Dst:21

9. Src_0:1024 ACK to Dst:220

10. Src_1:1025 RST to Dst:220

11. Dst:21 SYN ACK to Src_0:1025

12. Src_0:1024 ACK to Dst:21

At this point the analysis is making inroads: the RST packets are used to indicate that the receiving machine received an unexpected packet. Since a RST follows a SYN and SYN ACK pair, it means that the initial SYN was spoofed. This eliminates Src_1 from the game. Src_0 instead initiates a conversation, as shown by the ACK, and hence must be the true perpetrator. This address is inserted into the hotlist.
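The inference just described might be sketched as follows. The event representation and the function name are assumptions made for this illustration, and the sketch tracks hosts rather than individual connections, which a stateful sniffer would do per address and port pair.

```python
def classify_sources(events):
    """Split sources into spoofed and genuine from handshake behaviour.

    `events` is an ordered list of (sender, kind, receiver) tuples, where
    kind is one of "SYN", "SYN ACK", "RST" or "ACK".
    """
    answered = set()  # sources to which the destination has sent a SYN ACK
    spoofed, genuine = set(), set()
    for sender, kind, receiver in events:
        if kind == "SYN ACK":
            answered.add(receiver)
        elif kind == "RST" and sender in answered:
            # The host did not expect the SYN ACK, so it never sent the SYN.
            spoofed.add(sender)
        elif kind == "ACK" and sender in answered:
            # The host completes the handshake, so the SYN was really its own.
            genuine.add(sender)
    return spoofed, genuine


# The second series from the description, reduced to hosts.
SECOND_SERIES = [
    ("Src_1", "SYN", "Dst"), ("Src_1", "SYN", "Dst"),
    ("Dst", "SYN ACK", "Src_1"), ("Src_0", "SYN", "Dst"),
    ("Dst", "SYN ACK", "Src_1"), ("Src_0", "SYN", "Dst"),
    ("Dst", "SYN ACK", "Src_0"), ("Src_1", "RST", "Dst"),
    ("Src_0", "ACK", "Dst"), ("Src_1", "RST", "Dst"),
    ("Dst", "SYN ACK", "Src_0"), ("Src_0", "ACK", "Dst"),
]
```

Applied to the series above, the sketch marks Src_1 as spoofed and Src_0 as genuine, so Src_0 is the address that would be placed on the hotlist.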

The pattern that emerges is:

- An ICMP Echo Request, nmap SYN FIN scan followed by FTP and IMAPv3 connections,

- It is reasonable to infer that the nmap SYN FIN scan is used to determine what to do next (i.e. FTP and IMAPv3 since the ports are open),

- The rootkit uses 4 spoofed addresses for the initial scan and 2 spoofed addresses for the intrusion attempt.

This can then be entered into a catalogue of “seen patterns”. To collect patterns of this type it is useful to have machines on the system that run TCP wrappers and respond positively to connection attempts. That way the complete pattern can be found, even if some of the services are not open on any real system.

From this discussion it is clear that a substantial set of data needs to be collected to make fairly simple inferences. Matching RST packets to initial SYN packets is best done in a stateful sniffer, that tracks TCP connections. By being able to identify at least some of the spoofed source addresses the analysis becomes much simpler. Piecing the remainder of the fingerprint together requires the data to be available in the database. Traditional IDSs do not have the capability to collect volumes of data that do not individually generate alerts and only use that data to piece together fingerprints.

1. A system for analyzing network traffic, comprising the steps of:using detecting means including a tap which receives and selects packetsof data from network traffic and packet creating means which, for eachpacket selected by the tap, creates a modified selected packet foranalysis which consists of the selected packet and a unique identifierfor the selected packet which distinguishes that selected packet fromall other selected packets, wherein the detecting means analyzes themodified selected packets to detect suspect modified data packets whichmeet criteria defined by one or more functions in the detecting means,the criteria being indicative of potentially damaging traffic on thenetwork; forwarding details of each detected suspect modified datapacket to data processing means; storing details of each detectedsuspect modified data packet so as to be accessible for use in analysisby the data processing means in conjunction with the details of otherdetected modified suspect packets; and using the data processing meansto analyze the stored suspect modified data packets.
2. A system as claimed in claim 1 wherein the unique identifier includes an identifier for the tap and a time stamp.
3. A system as claimed in claim 1 wherein the tap carries out an initial filtering stage in respect of types of network traffic in order to select packets of data.
4. A system as claimed in claim 1 wherein the modified selected packets for analysis are filtered in order to detect packets which meet the defined criteria, by means of an adapter which enables the application of a function to part of a packet.
5. A system as claimed in claim 4 wherein a plurality of adapters are provided which enable the application of functions to different parts of a packet.
6. A system as claimed in claim 1 wherein packets received and selected by the tap are placed in a static buffer, and the packet creating means copies the selected packets from the static buffer into a structure allocated from a memory pool.
7. A system as claimed in claim 6 wherein the packet creating means puts a pointer to the structure in a buffer.
8. A system as claimed in claim 7 wherein the buffer grows dynamically to accommodate peaks in traffic.
9. A system as claimed in claim 1 wherein a plurality of taps are provided, each with an associated packet creating means.
10. A system as claimed in claim 1 wherein detected packets are collected and forwarded to the data processing means in groups.
11. A system as claimed in claim 10 wherein collected packets are forwarded as a group to the data processing means at predetermined time intervals.
12. A system as claimed in claim 11 wherein if the number of packets collected exceeds a predetermined limit before the expiry of the predetermined time interval, then the collected packets are analysed in accordance with predetermined criteria to establish whether only some of the packets may be forwarded to the data processing means as representative of a series of like packets.
13. A system as claimed in claim 1 wherein functions applied to the packets, to detect those which meet the defined criteria, are encoded into a function tree.
14. A system as claimed in claim 1 which is configured to detect suspect network traffic and provide alerts by means of an alert process in the event that an attack or potential network attack is identified.
15. A system as claimed in claim 14 wherein a logger forwards details of detected packets to the data processing means for analysis, the logger also forwarding details of high priority packets to the alert process.
16. A system as claimed in claim 14 wherein the data processing means includes a database server hosting a database on which details of detected packets are stored, and at least one data series server which queries the database to create statistics and provides information to the alert process.
17. A system as claimed in claim 16 comprising a plurality of data series servers which create different types of statistics.
18. A system as claimed in claim 14 wherein a historical analysis is carried out on details of detected packets stored by the data processing means.
19. A system as claimed in claim 14 wherein means are provided so that information concerning a number of detected packets which are related may be aggregated and a single alert provided.
20. A system as claimed in claim 14 wherein means are provided so that a list may be stored to identify network attacks on which resources are to be concentrated.
21. A system as claimed in claim 14 wherein means are provided so that a list of permitted data may be stored and all packets which do not correspond to this list are detected.
22. A system as claimed in claim 14 wherein patterns are identified in the stored data and used to predict the next sequence of packets if an attack is identified.
23. A system as claimed in claim 14 which is configured to detect a distributed denial of service attack as a deviation from normal traffic patterns.